<Stuff about="code" />: Raspberry Pi - Python

As part of my iPlayer personal podcast solution I needed a way of creating an RSS feed. Python seemed like a logical choice, given it's a flexible and powerful scripting language and it comes pre-installed on the Pi's Debian distro, even though I had never written a Python script before!

Create a podcast from a directory
I chose a really simple implementation: a Python program that recurses through the directory where the media files are stored and creates an RSS XML file based on the content.

It only implements the base RSS specification. There are a number of tags for providing additional metadata, particularly for use with iTunes, but for my purposes these weren't required. For more information about the RSS standard and podcasts see http://www.podcast411.com/howto_1.html.

I also chose to output the XML as strings rather than using an XML library, simply because it seemed like a significant overhead (mainly in terms of learning how) just to output a simple structure.
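For comparison, here is a minimal sketch of the library route, assuming Python's standard xml.etree.ElementTree module. The build_feed function and its arguments are hypothetical names used only for illustration; they are not part of the script in this post.

```python
import xml.etree.ElementTree as ET

def build_feed(title, description, link, items):
    """Build a minimal RSS 2.0 document; items is a list of (title, url) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "description").text = description
    ET.SubElement(channel, "link").text = link
    for item_title, item_url in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_url
    # ElementTree escapes &, < and > in element text when serialising
    return ET.tostring(rss, encoding="utf-8").decode("utf-8")

# hypothetical sample values, reusing the placeholder URL from this post
feed = build_feed("Podcast title", "Podcast description",
                  "http://www.myurl.com/mypodcast/index.html",
                  [("Tom & Jerry", "http://www.myurl.com/mypodcast/a.mp3")])
print(feed)
```

The trade-off is a little more to learn up front in exchange for not having to worry about special characters in titles and URLs.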

This script has only been tested for my requirements and it's not really designed for a tremendous amount of re-use, but feel free to adapt it to your needs.

Python script - createRSSFeed.py

# import libraries
import os
import sys
import datetime
import time

# import constants from stat library
from stat import * # ST_SIZE ST_MTIME

# format date method
def formatDate(dt):
    return dt.strftime("%a, %d %b %Y %H:%M:%S +0000")

# get the item/@type based on file extension
def getItemType(fileExtension):
    if fileExtension == "aac":
        # registered MIME type for raw AAC audio
        mediaType = "audio/aac"
    elif fileExtension == "mp4":
        mediaType = "video/mp4"
    else:
        # default: treat anything else as MP3 audio
        mediaType = "audio/mpeg"
    return mediaType

# constants
# the podcast name
rssTitle = "Podcast title"
# the podcast description
rssDescription = "Podcast description"
# the url where the podcast items will be hosted
rssSiteURL = "http://www.myurl.com/mypodcast"
# the url of the folder where the items will be stored
rssItemURL = rssSiteURL + "/iPlayerRadioDownloads"
# the url to the podcast html file
rssLink = rssSiteURL + "/index.html"
# url to the podcast image
rssImageUrl = rssSiteURL + "/logo.jpg"
# the time to live (in minutes)
rssTtl = "60"
# contact details of the web master
rssWebMaster = "me@me.com"

# record datetime started
now = datetime.datetime.now()

# command line options
# - python createRSSFeed.py /path/to/podcast/files /path/to/output/rss
# directory passed in
rootdir = sys.argv[1]
# output RSS filename
outputFilename = sys.argv[2]

# Main program
# open rss file
outputFile = open(outputFilename, "w")

# write rss header

outputFile.write("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n")
outputFile.write("<rss version=\"2.0\">\n")
outputFile.write("<channel>\n")
outputFile.write("<title>" + rssTitle + "</title>\n")
outputFile.write("<description>" + rssDescription + "</description>\n")
outputFile.write("<link>" + rssLink + "</link>\n")
outputFile.write("<ttl>" + rssTtl + "</ttl>\n")
outputFile.write("<image><url>" + rssImageUrl + "</url><title>" + rssTitle + "</title><link>" + rssLink + "</link></image>\n")
outputFile.write("<copyright>mart 2012</copyright>\n")
outputFile.write("<lastBuildDate>" + formatDate(now) + "</lastBuildDate>\n")
outputFile.write("<pubDate>" + formatDate(now) + "</pubDate>\n")
outputFile.write("<webMaster>" + rssWebMaster + "</webMaster>\n")

# walk through all files and subfolders
for path, subFolders, files in os.walk(rootdir):
    for file in files:
        # split the file name on "."; the first part is used as the title
        # and the extension to work out the media type
        fileNameBits = file.split(".")
        # get the full path of the file
        fullPath = os.path.join(path, file)
        # get the stats for the file
        fileStat = os.stat(fullPath)
        # find the path relative to the starting folder, e.g. /subFolder/file
        relativePath = fullPath[len(rootdir):]

        # write rss item
        outputFile.write("<item>\n")
        outputFile.write("<title>" + fileNameBits[0].replace("_", " ") + "</title>\n")
        outputFile.write("<description>A description</description>\n")
        outputFile.write("<link>" + rssItemURL + relativePath + "</link>\n")
        outputFile.write("<guid>" + rssItemURL + relativePath + "</guid>\n")
        outputFile.write("<pubDate>" + formatDate(datetime.datetime.fromtimestamp(fileStat[ST_MTIME])) + "</pubDate>\n")
        outputFile.write("<enclosure url=\"" + rssItemURL + relativePath + "\" length=\"" + str(fileStat[ST_SIZE]) + "\" type=\"" + getItemType(fileNameBits[-1]) + "\" />\n")
        outputFile.write("</item>\n")

# write rss footer

outputFile.write("</channel>\n")
outputFile.write("</rss>")
outputFile.close()
print("complete")
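The getItemType function above maps only two extensions by hand. As a possible alternative, here is a minimal sketch that asks the standard library's mimetypes module to guess the type from the file name. The name guess_item_type and the audio/mpeg fallback are assumptions for illustration, and the guesses can vary with the MIME tables installed on the system.

```python
import mimetypes

def guess_item_type(file_name, fallback="audio/mpeg"):
    """Guess the enclosure MIME type from the file name, else use fallback."""
    # guess_type returns a (type, encoding) pair; type is None when unknown
    media_type, _encoding = mimetypes.guess_type(file_name)
    return media_type or fallback

# hypothetical file names
print(guess_item_type("episode_one.mp3"))
print(guess_item_type("episode_two.mp4"))
```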

Running the script
The script expects 2 parameters:

The path where the media files are stored
The path of the output rss file

python createRSSFeed.py /path/to/media/files /path/to/output/RSSFile.rss

Update - I came across some problems when there were special characters (such as & or <) in the text being written to the XML, so I had to write a function to encode text to make it XML safe.
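That function is not reproduced in this post. As a minimal sketch of one way to do it, assuming the standard library's xml.sax.saxutils module, the helpers below escape text for element content and for attribute values. The names xml_text and xml_attr are hypothetical.

```python
from xml.sax.saxutils import escape

def xml_text(value):
    """Escape &, < and > so the value is safe as element text."""
    return escape(value)

def xml_attr(value):
    """Escape &, <, > and double quotes so the value is safe inside "..." attributes."""
    return escape(value, {'"': "&quot;"})

# hypothetical title and URL containing characters that would break the feed
print("<title>" + xml_text("Tom & Jerry <live>") + "</title>")
print('<enclosure url="' + xml_attr("http://www.myurl.com/a.mp3?x=1&y=2") + '" />')
```

In the script, each value concatenated into an outputFile.write call (titles, links, enclosure URLs) would be passed through one of these first.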

Update - Dan Goff sent me on a modified version of this program which uses the mutagen library to include data from ID3 tags in mp3 files
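Dan's modified version is not shown here. As a rough sketch of the idea only, assuming the third-party mutagen library and its EasyID3 interface, the code below reads a few ID3 fields from an mp3 file. The function names, the choice of fields, and the fallback values are all assumptions for illustration and are not taken from his program.

```python
def first_tag(tags, key, default):
    """Return the first value stored under key, or default if missing or empty."""
    values = tags.get(key) or [default]
    return values[0]

def describe_mp3(path, fallback_title):
    """Read title, artist and album from the ID3 tags of an mp3 file."""
    # mutagen is a third-party library (pip install mutagen); it is imported
    # here so that first_tag can be used without it being installed
    from mutagen.easyid3 import EasyID3
    # EasyID3 behaves like a dictionary of lists of strings;
    # it raises an error if the file has no ID3 header
    tags = EasyID3(path)
    return {
        "title": first_tag(tags, "title", fallback_title),
        "artist": first_tag(tags, "artist", ""),
        "album": first_tag(tags, "album", ""),
    }
```

The returned title could then replace the file-name-based title in each item, with the file name kept as the fallback.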

7 comments:

Dan, 21 August 2012 at 15:04
Thanks a lot for sharing this - I have a podcast that I can only get via paid subscription, and thus there's no RSS feed available for it. I used your script to take the downloaded episodes and then automagically import them into my music software so I could aggregate them automatically along with the other podcasts I listen to.

I went one step farther and used the Mutagen library to harvest some of the information for each entry into the XML file from the ID3 tags. That might be useful for you or someone else creating your own podcast. Rather than having to manually enter the information for each episode into the XML file, simply encode it into the ID3 tags for each file and then harvest it automatically.

bryan, 10 September 2012 at 20:38
i'd love to see this too, could you share with me? itadakiorange at gmail


Friday, 8 June 2012

Raspberry Pi - Python - create podcast / RSS
