A US-based direct-to-consumer brand sold across three major marketplaces and needed to monitor competitor pricing for their 15,000 product SKUs every two hours. Their previous SaaS tool was missing 30% of price changes due to bot detection failures and had no visibility into third-party seller prices on the same ASINs.
This meant their repricing engine was operating on stale data, so they lost sales to competitors who were consistently a few cents cheaper. The client needed a custom, highly reliable pipeline that could feed their automated repricing system via REST API with under 30 minutes of latency.
Platform-Specific Scrapy Spiders
Built dedicated, highly concurrent Scrapy spiders for Amazon, Walmart, and eBay (Scrapy schedules requests asynchronously rather than with threads). Each spider was tuned to that platform's specific HTML structure, pagination patterns, and anti-bot behavior. The Amazon spider used Playwright for product pages; the Walmart spider used a hybrid httpx + Playwright approach to get past DataDome.
Rotating Residential Proxy Pools
Integrated US-based residential proxy pools, with a separate pool for each of Amazon, Walmart, and eBay. Amazon required premium residential proxies with low block rates. eBay was more tolerant but required cookie session management.
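The per-platform pool separation can be illustrated with a minimal round-robin sketch. The proxy URLs, the `PROXY_POOLS` mapping, and the `ProxyRotator` class are all hypothetical names chosen for illustration; the actual provider, pool sizes, and rotation policy are not described here.

```python
from itertools import cycle

# Hypothetical proxy endpoints; real pool contents would come from the provider.
PROXY_POOLS = {
    "amazon": ["http://amz-1.proxy.example:8000", "http://amz-2.proxy.example:8000"],
    "walmart": ["http://wmt-1.proxy.example:8000", "http://wmt-2.proxy.example:8000"],
    "ebay": ["http://ebay-1.proxy.example:8000"],
}


class ProxyRotator:
    """Round-robin rotation that never mixes proxies between platforms."""

    def __init__(self, pools):
        # One independent cycle per platform, so a block on one
        # marketplace does not affect the rotation on the others.
        self._cycles = {name: cycle(proxies) for name, proxies in pools.items()}

    def next_proxy(self, platform):
        return next(self._cycles[platform])


rotator = ProxyRotator(PROXY_POOLS)
amazon_proxy = rotator.next_proxy("amazon")
```

Keeping the pools separate means a burned proxy on one marketplace can be retired without touching the others.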
Redis Job Queue with Priority Scheduling
SKUs were queued in Redis with priority tiers: the top 1,000 bestsellers were scraped every 30 minutes, and the remaining 14,000 ran on a 2-hour rotation. Failed scrapes were automatically requeued with exponential backoff.
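The tiering and retry logic can be sketched as follows. To keep the example runnable without a Redis server, an in-memory heap stands in for the queue; in Redis the same idea would typically map to a sorted set scored by the next due timestamp. The tier names, the 60-second base delay, the 30-minute cap, and the `ScrapeQueue` class are assumptions made for illustration, not the production values.

```python
import heapq
import itertools

# Refresh intervals in seconds, taken from the tiers described above.
TIER_INTERVALS = {"top": 30 * 60, "standard": 2 * 60 * 60}

# Assumed retry parameters (not stated in the case study).
BASE_RETRY_DELAY = 60
MAX_RETRY_DELAY = 30 * 60


def backoff_delay(attempt):
    """Delay before retry number `attempt` (1-based): 60s, 120s, 240s, ... capped."""
    return min(BASE_RETRY_DELAY * 2 ** (attempt - 1), MAX_RETRY_DELAY)


class ScrapeQueue:
    """In-memory stand-in for a Redis sorted set keyed by due time."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal due times

    def schedule(self, sku, tier, due_at, attempt=0):
        heapq.heappush(self._heap, (due_at, next(self._counter), sku, tier, attempt))

    def pop_due(self, now):
        """Return (sku, tier, attempt) for the earliest due job, or None."""
        if self._heap and self._heap[0][0] <= now:
            _, _, sku, tier, attempt = heapq.heappop(self._heap)
            return sku, tier, attempt
        return None

    def report(self, sku, tier, attempt, now, success):
        """Reschedule after a scrape: normal interval on success, backoff on failure."""
        if success:
            self.schedule(sku, tier, now + TIER_INTERVALS[tier])
        else:
            self.schedule(sku, tier, now + backoff_delay(attempt + 1), attempt + 1)
```

A worker loop would call `pop_due` with the current time, run the scrape, and pass the outcome to `report`, so a failing SKU retries quickly at first and then backs off instead of consuming the request budget.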
REST API Delivery to Repricing Engine
Built a lightweight FastAPI endpoint that the client's repricing engine polls. The API returns the current price, seller info, availability, and a change_detected flag so the engine only processes records where the price has moved.
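As a rough illustration of how the change_detected flag might be derived, the sketch below compares the latest observation with the previous one and builds the record the endpoint would return. It uses only the standard library and leaves out the FastAPI routing. The field names, the `PriceObservation` type, and the choice to treat a first-ever observation as a change are assumptions, not the client's actual schema.

```python
from dataclasses import asdict, dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class PriceObservation:
    sku: str
    platform: str
    seller: str
    price: Decimal
    in_stock: bool


def build_record(current: PriceObservation,
                 previous: Optional[PriceObservation]) -> dict:
    """Build the API payload for one SKU, flagging whether the price moved."""
    # Assumption: a SKU seen for the first time counts as a change,
    # so the repricing engine picks it up at least once.
    changed = previous is None or current.price != previous.price
    record = asdict(current)
    record["price"] = str(current.price)  # Decimal is not JSON-serializable
    record["change_detected"] = changed
    return record
```

Using `Decimal` rather than `float` avoids spurious flags from floating-point rounding: `Decimal("19.99")` and `Decimal("19.990")` compare equal, so a formatting difference alone does not trigger a reprice.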
- Amazon's bot detection required realistic browser fingerprints and consistent session cookies across requests
- Walmart DataDome protection required Playwright with full JS execution
- eBay frequently changes HTML class names, requiring automated CSS selector health checks
- Managing 15,000 SKUs across 3 platforms with a 2-hour SLA required careful rate limit budgeting
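To give a sense of scale for the rate limit budget, the arithmetic below estimates the sustained request rate. It assumes every SKU is checked on all three platforms, one page fetch per SKU per platform per cycle, and no retries; the real figures would be higher once retries and the extra requests made by Playwright page loads are counted.

```python
PLATFORMS = 3

# Tier sizes and refresh intervals (seconds) from the scheduling description.
TOP_SKUS, TOP_INTERVAL_S = 1_000, 30 * 60
REST_SKUS, REST_INTERVAL_S = 14_000, 2 * 60 * 60


def requests_per_second(skus, interval_s, platforms=PLATFORMS):
    """Average fetch rate needed to cover `skus` on every platform each interval."""
    return skus * platforms / interval_s


top_rps = requests_per_second(TOP_SKUS, TOP_INTERVAL_S)     # 3,000 fetches / 1,800 s
rest_rps = requests_per_second(REST_SKUS, REST_INTERVAL_S)  # 42,000 fetches / 7,200 s
total_rps = top_rps + rest_rps
per_platform_rps = total_rps / PLATFORMS

print(f"total: {total_rps:.2f} req/s, per platform: {per_platform_rps:.2f} req/s")
```

Under these assumptions the pipeline needs roughly 7.5 requests per second overall, or about 2.5 per platform, sustained around the clock. That is the baseline each platform's proxy pool and retry policy has to fit within.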