Scraping Amazon's Wishlist

Sat, 21st May 2016

I wanted to keep track of the prices on my Amazon wishlist, in case something I was after suddenly dipped in price (as they do) and I could grab myself a bargain. I shoved together a little PHP script that extracts the items and prices, compares them to the previous prices, and emails me if a price has dropped. It's not a complicated script and doesn't do anything fancy (like tracking the original price, etc.).

It will notify me whenever a price drops compared to the previous price it recorded. I was going to keep track of the lowest price instead, but I figured that a low might be a one-off, in which case the script would only ever notify once. And while this script does notify for any price drop, I only receive one or two emails per day, so it's not spamming me. If you use this script as part of a scheduled job (such as cron), I would recommend calling it no more than once an hour. I've read suggestions that Amazon could blacklist your IP address if it thinks you're attacking the site. Once an hour is plenty.
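
The notification rule described above can be sketched as a small standalone function. This is only an illustration of the comparison, not part of the script itself: the name findPriceDrops and the sample data are made up for the example, and both arrays are assumed to map item title to price, the same shape the script keeps in its JSON file.

```php
<?php

// Hypothetical helper: returns the items whose price fell since the last run,
// as title => difference (a negative number).
function findPriceDrops(array $previous, array $current): array
{
	$drops = [];

	foreach ($current as $title => $price)
	{
		// Items seen for the first time have nothing to compare against.
		if (!isset($previous[$title]))
		{
			continue;
		}

		$diff = round($price - $previous[$title], 2);

		// A price of 0 stands for "Unavailable", so it never counts as a drop.
		if ($price > 0 && $diff < 0)
		{
			$drops[$title] = $diff;
		}
	}

	return $drops;
}

// Made-up sample data: only "Book" has become cheaper.
$previous = ["Book" => 12.99, "Game" => 30.00, "Film" => 8.00];
$current  = ["Book" => 9.99,  "Game" => 30.00, "Film" => 0, "New" => 5.00];

print_r(findPriceDrops($previous, $current));
```

Because the comparison is always against the previous run rather than an all-time low, an item that creeps up and then drops again will trigger a fresh email each time it drops.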

Note: this script has only been run against Amazon UK. You may need to modify the $url (and the £ in the price regex) to suit your own needs.

#!/usr/bin/php
<?php

	$OUT_FILE = "path/to/wishList.json";
	$EMAIL_TO = "example@example.com";
	$WISHLIST_ID = "wishlist.id";
	
	$items = [];
	
	function getItems()
	{
		global $items;
		global $OUT_FILE, $EMAIL_TO, $WISHLIST_ID;
		
		$page = 1;
		$hasMore = false;
		$lastTitle = "";
		$sanity = 0;
		
		do
		{
			$hasMore = false;
		
			$url = "https://www.amazon.co.uk/gp/registry/wishlist/$WISHLIST_ID/ref=cm_wl_sortbar_o_page_2?ie=UTF8&page=$page";
		
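			$html = file_get_contents($url);
			
			// Stop paging if the download failed.
			if ($html === false)
			{
				break;
			}
			
			// Item titles come from the links to each product page.
			$regex = '/title="([^"]+)" href="\\/dp/i';
			preg_match_all($regex, $html, $titleMatches);
			$titles = array_values(array_unique($titleMatches[1]));
			
			// Prices follow a run of at least two spaces. "Unavailable" items
			// leave the inner group empty, and so end up with a price of 0.
			$regex = '/[ ]{2,}(\\£([0-9.]+)|Unavailable)/i';
			preg_match_all($regex, $html, $priceMatches);
			$prices = $priceMatches[2];
			
			// Only trust the page if every title has a matching price.
			if (count($titles) > 0 && count($titles) == count($prices))
			{
				for ($i = 0 ; $i < count($titles) ; $i++)
				{
					$title = $titles[$i];
					
					// The cast turns the empty "Unavailable" match into 0;
					// adding 0 to an empty string is an error in PHP 8.
					$price = (float) $prices[$i];
					
					$items[$title] = $price;
					
					echo "$title = $price\n";
				}
				
				// A page that starts with the same item as the previous one
				// means we've run off the end of the list.
				$hasMore = $lastTitle != $titles[0];
				
				$lastTitle = $titles[0];
			}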
			
			$page++;
			
			if (++$sanity >= 10)
			{
				$hasMore = false;
			}
			
			sleep(1);
		}
		while ($hasMore);
	}
	
	getItems();
	
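	if (file_exists($OUT_FILE))
	{
		$json = json_decode(file_get_contents($OUT_FILE), true);
		
		// Treat an unreadable or corrupt file as having no previous prices.
		if (!is_array($json))
		{
			$json = [];
		}
		
		// Nothing scraped but prices on record: report the failure and stop,
		// so the recorded prices aren't overwritten with an empty list.
		if (count($items) == 0 && count($json) > 0)
		{
			mail($EMAIL_TO, "Wishlist checker failed", "");
			
			exit(1);
		}
		
		foreach ($items as $title => $price)
		{
			$price = round($price, 2);
		
			$diff = 0;
			
			// Compare against the price recorded on the previous run, if any.
			if (isset($json[$title]))
			{
				$diff = round($price - $json[$title], 2);
			}
			
			// A price of 0 means the item is unavailable, so ignore it.
			if ($price > 0 && $diff < 0)
			{
				$subject = "$title : £$price ($diff)";
				
				mail($EMAIL_TO, $subject, "");
			}
			
			$items[$title] = $price;
		}
	}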
	
	$json = json_encode($items, JSON_PRETTY_PRINT);
	
	//mail($EMAIL_TO, "Prices", $json);
	
	file_put_contents($OUT_FILE, $json);

?>
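
The note about other Amazon sites comes down to two hard-coded values in the script above: the host in $url and the £ in the price regex. A minimal sketch of pulling both out into parameters is shown here. The helper names buildWishlistUrl and buildPriceRegex are hypothetical, and it is purely an assumption that other regional sites use the same URL path and page markup; the script has only been run against Amazon UK.

```php
<?php

// Hypothetical helper: same URL as the script, with the host made a parameter.
// Assumes (unverified) that other regional sites share this path.
function buildWishlistUrl(string $host, string $wishlistId, int $page): string
{
	return "https://$host/gp/registry/wishlist/$wishlistId/ref=cm_wl_sortbar_o_page_2?ie=UTF8&page=$page";
}

// Hypothetical helper: same price regex as the script, with the currency
// symbol made a parameter.
function buildPriceRegex(string $currencySymbol): string
{
	// preg_quote() escapes symbols such as "$" that are special in a regex.
	$symbol = preg_quote($currencySymbol, '/');

	return '/[ ]{2,}(' . $symbol . '([0-9.]+)|Unavailable)/i';
}

echo buildWishlistUrl("www.amazon.co.uk", "wishlist.id", 1) . "\n";

// Made-up fragment standing in for a downloaded page.
$html = "   \$12.99\n   Unavailable\n";
preg_match_all(buildPriceRegex('$'), $html, $matches);
print_r($matches[2]);
```

As in the script, an "Unavailable" entry still produces a match but leaves the inner price group empty, which keeps the list of prices the same length as the list of titles.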

To run this script, copy the code above into a file (or download it at the bottom of this page), update the $OUT_FILE, $WISHLIST_ID, and $EMAIL_TO with actual, valid values, chmod +x the file, and then run it. To wit:

> vi wishList.sh
> chmod +x wishList.sh
> ./wishList.sh

All being well, the wishlist will be scraped and a JSON file written containing all the items and prices. The code will attempt to scan multiple pages (up to 10, the script's sanity limit), so you can have more than 25 items (the current limit per page) in your list.
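
The multi-page scan relies on a simple stopping rule: keep requesting pages until one starts with the same item as the page before it, or until the page limit is hit. A rough sketch of that rule on its own follows. The function collectAllTitles and the fake page data are invented for the example, and $fetchTitles stands in for the script's download-and-regex step. It also assumes, as the script appears to, that asking for a page past the end returns the last page again.

```php
<?php

// Hypothetical sketch of the paging loop. $fetchTitles takes a page number
// and returns the list of titles found on that page.
function collectAllTitles(callable $fetchTitles, int $maxPages = 10): array
{
	$all = [];
	$lastFirstTitle = null;

	for ($page = 1; $page <= $maxPages; $page++)
	{
		$titles = $fetchTitles($page);

		// An empty page, or one starting with the same item as the previous
		// page, is treated as the end of the list.
		if (count($titles) === 0 || $titles[0] === $lastFirstTitle)
		{
			break;
		}

		$lastFirstTitle = $titles[0];
		$all = array_merge($all, $titles);
	}

	return $all;
}

// Fake wishlist spread over two pages; any page past the end repeats page 2.
$pages = [1 => ["A", "B"], 2 => ["C"]];
$fetch = function (int $page) use ($pages): array {
	return $pages[min($page, 2)];
};

print_r(collectAllTitles($fetch));
```

The page cap plays the same role as the script's $sanity counter: if the repeat check never fires, the loop still ends after a fixed number of requests.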

Downloads: wishList.sh (0Kb)
 