TeCoEd (Teaching Computing Education)

What is it?


We have probably all checked the train times, then forgotten or misread them and missed a train, only to be left searching for the next available one. There are many websites, apps and timetables, but these, although quick, still require you to unlock your phone, load the app and enter the stations. This can take some time. If you make the same journey each day between the same stations, imagine how useful a simple program would be that, when run, loaded the time of the next train to your destination. Let's scrape some train times!

The Basic Plan


The original idea and concept was brought to my attention by txt3rob for a section of my Project New York. The overall approach and the individual parts of the solution are smart and create a useful scraping tool. Big up to @SmspiUk for the support.

The solution has several parts but in essence is simple:
  1. Use the National Rail website to locate the stations and times
  2. Edit / Copy the web URL
  3. Open a new Google Sheet
  4. Paste the HTML code in
  5. This will load the train timetable data
  6. Download the Google Sheet as a CSV file via Chrome
  7. In the downloads folder, copy the download URL
  8. Remove all of the URL after gid=0
  9. Add @output=csv to the end of the URL
  10. Create the Python Code
  11. Run!

Parts 1 and 2


The first part is to go to National Rail and enter the details for the two stations that you wish to travel between. Press Enter and this will load the train times. The URL in the address bar is essentially the data link that you need, but it needs editing to make it fully functional. The HTML is standard for all searches, so you can copy the code below and change the station abbreviations. A full list of these can be found here; check the abbreviations as some are not obvious. For example, London Kings Cross is KGX, not LKG!
Download: HTML link (docx, 11 kb)


Parts 3 to 5


Next, sign up for Google Drive, or log in if you already have an account. Create a new Google Sheet and save it. Then copy your HTML code and paste it into the first cell, A1; this will begin downloading the train time data to the Sheet. By reversing the station abbreviations you can also copy and paste the return times into the same Sheet.
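
Swapping the station abbreviations for the outward and return journeys is easy to get wrong by hand, so here is a minimal sketch of doing it in Python. The real National Rail link is in the HTML link file and is not reproduced here, so URL_TEMPLATE below is a made-up placeholder, and journey_url and return_url are hypothetical helper names. The only thing taken as given is that station abbreviations are three letters, like KGX.

```python
# Hypothetical placeholder template: replace it with the link from your own search,
# putting {origin} and {destination} where the two station abbreviations appear.
URL_TEMPLATE = "https://example.com/journey/{origin}/{destination}"


def journey_url(origin, destination, template=URL_TEMPLATE):
    """Fill the two three-letter station abbreviations into the template."""
    for code in (origin, destination):
        # Station abbreviations are three letters, e.g. KGX for London Kings Cross.
        if len(code) != 3 or not code.isalpha():
            raise ValueError(f"station abbreviations are three letters, got {code!r}")
    return template.format(origin=origin.upper(), destination=destination.upper())


def return_url(origin, destination, template=URL_TEMPLATE):
    """Swap the two abbreviations to get the link for the return journey."""
    return journey_url(destination, origin, template)
```

Calling journey_url("KGX", "YRK") and return_url("KGX", "YRK") would give the two links to paste into the Sheet, one for each direction.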

Trains in Action



Parts 6 to 9


This part is really simple. Making sure that you are using Chrome, wait for the data to load into the Sheet, then right click and download the file as a CSV file. The download link appears at the bottom of the screen; click it to open the downloads folder. Find the downloaded train CSV file and right click to copy the download URL. This will be used later in the Python program. However, there is one final edit to make to this URL: everything after format=csv needs deleting, that is, everything from &id= onwards in the example below.

Before:
https://docs.google.com/spreadsheets/d/1D9j8t6qMdAWFE2OfqcTXMDiB9e4oowXgS2ORuPNn-ek/export?format=csv&id=1D9j8t6qMdAWFE2OfqcTXMDiB9e4oowXgS2ORuPNn-ek&gid=0

After:
https://docs.google.com/spreadsheets/d/1D9j8t6qMdAWFE2OfqcTXMDiB9e4oowXgS2ORuPNn-ek/export?format=csv
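
If you would rather not trim the URL by hand, the same edit can be sketched in Python. This is only an illustration: trim_export_url is a hypothetical helper name, and it assumes the copied URL has the same shape as the example above, with a format=csv parameter followed by others.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def trim_export_url(url):
    """Keep only the format=... parameter and drop everything after it."""
    parts = urlsplit(url)
    # Split the query string into (name, value) pairs and keep just "format".
    kept = [(name, value) for name, value in parse_qsl(parts.query) if name == "format"]
    # Rebuild the URL with the shortened query string and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Passing the "Before" URL to trim_export_url gives back the address ending in export?format=csv.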

Part 10 The Python Code


The last part of the solution is to create the Python program that will fetch the train data at the end of the URL and write it to a CSV file. First, create the CSV file in the folder where the Python program will run. In the example below the CSV file is called trains.csv. The program uses the URL library, urllib, to open the download URL that you created in the previous steps, followed by code that opens the file, writes to it and finally closes the CSV file. Time delays totalling 60 seconds have been added to ensure that the data can be downloaded to the file.

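import time
import urllib.request
web_file = urllib.request.urlopen("ENTER YOUR URL FROM PARTS 6 TO 9 HERE")  # open the CSV export URL
out_file = open('trains.csv', 'wb')  # binary mode, because read() returns bytes
time.sleep(50)  # delay to allow the data to download
out_file.write(web_file.read())  # copy the downloaded data into trains.csv

out_file.close()
time.sleep(10)
print("files written")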


Remember to enter your URL in the urlopen("add yours here") call, save the program and run it in LXTerminal.
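
As an alternative, here is a minimal sketch of the same download written as a function. It is not the original program: save_csv is a hypothetical name, the 30 second timeout is an arbitrary choice, and it assumes the Sheet can be opened from the link without signing in. The with statement closes both the web connection and the file, even if something goes wrong part way through.

```python
import shutil
import urllib.request


def save_csv(url, path="trains.csv", timeout=30):
    """Download the CSV export at url and write it to path."""
    # Both the response and the output file are closed when the block ends.
    with urllib.request.urlopen(url, timeout=timeout) as response, open(path, "wb") as out_file:
        # Copy the downloaded bytes straight into the local file.
        shutil.copyfileobj(response, out_file)
    return path
```

You would call it as save_csv("ENTER YOUR URL FROM PARTS 6 TO 9 HERE") and the data would land in trains.csv in the current folder.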
Download: Example Code (py)
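
Once trains.csv has been written, the times can be shown on screen. The sketch below assumes nothing about which columns the Sheet produces, since that depends on what the National Rail page returns; it simply reads the first few non-empty rows. read_rows is a hypothetical helper name and the limit of 5 rows is arbitrary.

```python
import csv
from itertools import islice


def read_rows(path="trains.csv", limit=5):
    """Return up to limit non-empty rows from the CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as csv_file:
        # Skip rows where every cell is blank.
        non_empty = (row for row in csv.reader(csv_file)
                     if any(cell.strip() for cell in row))
        return list(islice(non_empty, limit))
```

Looping over read_rows() and printing " | ".join(row) for each row would list the next few trains in the terminal.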
