|Version 46 (modified by max, 4 years ago)|
- Shepherd FAQ
- About Shepherd?
- Installing Shepherd
- How well is Shepherd working?
- Is there a log file somewhere?
- How do I make Shepherd use a particular grabber?
- Why is Shepherd so slow?
- Can I make Shepherd faster?
- Can I set some default options?
- Can I grab more than 7 days of data?
- Can I specify different configuration files for Shepherd to use?
- My high definition (HD) channels are missing programs?
- How do I use the new high definition (HD) channels?
- Some of my guide data looks wrong. How can I diagnose the problem?
- Some shows have titles ALL IN CAPS, and often including unwanted info like …
- Can I check Shepherd from within MythTV?
- My MythTV says "mythfilldatabase ran, but did not insert any new data"
- I don't have any guide data in MythTV. How do I test or debug?
- Shepherd works when I run it from the command-line, but not automatically …
- Shepherd is running hourly on my system. Shouldn't it run once per day?
- How can I prevent mythfilldatabase adding unwanted channels to my video …
- When mythfilldatabase runs (SVN release of mythtv, mythtv 0.21 and later), …
- My guide data is listed against the wrong channels inside MythTV.
- Which timezone should I set MythTV to? Auto, None, +1000?
Here you can find answers to some of the most frequently asked questions about Shepherd. If you have a question not answered on this page, feel free to add it to this page yourself (with no answer) or, better yet, post it to the mailing list and then edit the question and answer into this Wiki page!
What is Shepherd?
Shepherd is an attempt to reconcile many different tv_grab_au scripts and make one cohesive reliable data set.
How does it work?
It works by calling a series of scripts that grab data from a large variety of sources, and then analysing the resulting XML data sets and determining which of the many is the most reliable. Postprocessors are used to augment the data sets with additional information (e.g. movie information from http://www.imdb.com, HDTV programming from http://www.dba.org.au etc.).
When switching between data sources, Shepherd's reconciler also tries to ensure that programme names are consistent. e.g. if you're used to recording a programme called "House" yet a different data source names it as "House, M.D.", Shepherd is smart enough to remember the original name and substitute it. No configuration is necessary to enable this; it happens automatically.
Shepherd is designed to be future proof, never requiring manual intervention once initially installed and configured. Shepherd will automatically update itself with fixes, enhancements and additional plugin components as and when they become available.
The shepherd_logic wiki page contains a more complete technical description of the various stages of Shepherd and how it works.
How can I get help or ask a few questions?
Feel free to join our mailing list by sending an email with "subscribe shepherd <email>" in the body to majordomo@…. Once you've joined, you can post to the list by emailing shepherd@….
Note that there are no archives of the mailing list. If your question is not answered on this site, ask away...
Can I contribute?
Absolutely. Shepherd is a community project and is the result of countless contributors. If you wish to enhance some functionality within Shepherd (e.g. write a new postprocessor), implement some new fancy reconciling logic, implement a new grabber or just help out in answering questions or contributing to Wiki documentation, feel free to help!
Which operating systems does Shepherd support?
In theory, Shepherd (and its underlying components) will run on any operating system that supports Perl, as all scripts are currently written in Perl.
In practice, the developers all use Linux and MythTV, and that is what is known to work.
No effort has been put into making Shepherd work under Microsoft Windows or Windows Media Centre Edition, although it really shouldn't be too hard to get that working if anyone was motivated enough to do so.
Is Shepherd legal?
Some of the grabbers used by Shepherd read web sites that say they don't want their data used in PVRs, but that doesn't mean it's illegal. Shepherd doesn't copy or distribute data, but rather allows individuals at home to read it via their PVRs. It operates in the same manner as a browser, sending HTTP requests and formatting the resultant HTML for display in a manner appropriate to the user.
How do I install Shepherd?
See the Installation page.
How important is it to install the optional Perl modules?
Some of Shepherd's grabbers require additional Perl modules to be installed, without which they won't function. They are listed as "optional" because Shepherd does not rely on any individual grabber to do its job; instead it draws on as many or as few of its available grabbers as necessary to acquire guide data for the time period and channels you want.
Sometimes Shepherd can do this with a single grabber. More commonly, it employs multiple grabbers and combines their results.
Generally speaking, Shepherd can perform very well even if some of its grabbers are disabled or unsupported (i.e. missing modules). However, it will probably perform more efficiently, reliably, and possibly more accurately if you can enable all of its grabbers.
How well is Shepherd working?
A summary of Shepherd's performance can be viewed by:
In particular, note the last line, which tells you the percentage of wanted data it acquired. If it's less than 100.00%, Shepherd wasn't able to acquire complete data for all your channels over the next 7 days. If you haven't enabled all of Shepherd's grabbers, you will probably benefit from doing this.
If Shepherd is grabbing 100% of wanted data, then enabling additional grabbers may be unnecessary. However, doing so will still improve Shepherd's ability to tolerate a grabber failure, may allow it to run faster and use less bandwidth, and may improve its data quality.
Is there a log file somewhere?
Yes, in the log/ subdirectory of your Shepherd installation (usually ~/.shepherd/log/).
How do I make Shepherd use a particular grabber?
Generally it's best to let Shepherd decide which grabbers to use, and in which order. One of its main benefits is its fault-tolerance: guide data sources tend to be fragile and can stop working unexpectedly, but Shepherd will work regardless. Relying on a particular grabber always being there is, unfortunately, not safe.
If you have a general preference for speed over quality, or vice versa, you can control this via the "--mode" option.
To specify an exact order for grabbers, use the "--grabwith <grabber/s>" option. Shepherd will run the specified grabber(s) first, then others as needed to fill remaining holes in the data. For example:
~/.shepherd/tv_grab_au --grabwith oztivo,sbsnews_website
Why is Shepherd so slow?
Shepherd spaces out its downloads to avoid overloading its data source web sites. Most of the time Shepherd is running, it's not doing anything at all: it's simply pausing between downloads. Many times in the past, excessive traffic has prompted online guides to take measures to block grabbers, and it's essential for the benefit of all users that Shepherd play nice.
Since Shepherd runs in the background, how long it takes makes no difference. If you'd like it to complete faster, though, you may use the "--mode" option.
Can I make Shepherd faster?
By default Shepherd operates in "Quality" mode, whereby it selects grabbers based on maximizing data quantity and quality. This mode can take a long time: around 15-30 minutes under normal operating conditions, and several hours if it's being run for the first time, with no cache and needing to fill a full week.
For fastest operation, you can run Shepherd in "Speed" mode, which will make it select grabbers that complete fastest, even if their data quality is not the highest. "Efficiency" mode is a balance between the two.
In all modes, Shepherd will employ as many grabbers as necessary to completely fill the next 8 days with data.
To run Shepherd once in a particular mode:
~/.shepherd/shepherd --mode speed
or to permanently set Shepherd in a particular mode:
~/.shepherd/shepherd --component-set shepherd:mode=speed
Can I set some default options?
If you want Shepherd to always be called as if it was sent a particular command-line option, you can use:
~/.shepherd/tv_grab_au --component-set shepherd:<option>
For example, this would make Shepherd always run as if called with the option "--grabwith=abc_website":
~/.shepherd/tv_grab_au --component-set shepherd:grabwith=abc_website
If you want to set multiple options, they must all be set in a single command; otherwise the last command will override any options set previously. For example, to set both "--notquiet" and "--grabwith=abc_website":
~/.shepherd/tv_grab_au --component-set shepherd:notquiet:grabwith=abc_website
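Because each --component-set call replaces the previous defaults, it can help to build the whole option string in one place. The following is only a sketch, assuming a POSIX shell; the helper name build_component_set is hypothetical (it is not part of Shepherd), and the script prints the command for review rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: join a component name and its default options
# with ':' so they can be passed to --component-set in a single call.
build_component_set() {
    component=$1
    shift
    arg=$component
    for opt in "$@"; do
        arg="$arg:$opt"
    done
    printf '%s\n' "$arg"
}

# Print (rather than run) the resulting command so it can be checked first.
echo "~/.shepherd/tv_grab_au --component-set $(build_component_set shepherd notquiet grabwith=abc_website)"
```

Called with a component name and no options, the helper returns just the component name, which corresponds to the "clear all default options" form.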
You can also set default options for any component, e.g.:
~/.shepherd/tv_grab_au --component-set abc_website:do-extra-days
To clear all default options, call --component-set with no argument. E.g.:
~/.shepherd/tv_grab_au --component-set shepherd
Can I grab more than 7 days of data?
(Note: the Shepherd default is now eight days.)
You can ask Shepherd to try to grab however many days you like. Some channels in some regions offer up to 28 days of data; others as few as three or four. Generally, you can get at least 7 days for all but the community channels, and 14+ days for ABC and SBS.
The main reason you may not want more days is consistency. If you receive different numbers of days of guide data for different channels--which tends to happen once you push above 7 or 8 days--it's harder to spot new shows. The default of 8 days tends to keep your channels in sync, while also letting you see if the show you just recorded is also on next week.
To specify the number of days for one particular Shepherd run:
~/.shepherd/shepherd --days <n>
To set this as the default:
~/.shepherd/shepherd --component-set shepherd:days=<n>
Can I specify different configuration files for Shepherd to use?
No. Shepherd always expects to use the same configuration file (usually ~/.shepherd/shepherd.conf).
You can, however, use tricks such as changing the HOME environment variable to run multiple installs of Shepherd. This is not very efficient, as it leads to a fair bit of redundant downloading, but otherwise it is a good solution for someone wanting data from multiple regions.
HOME=/first/directory /first/directory/.shepherd/shepherd
HOME=/second/directory /second/directory/.shepherd/shepherd
Another similar way is to 'mv' the .shepherd directory.
mv ~/.shepherd ~/.shepherd.1
mv ~/.shepherd.2 ~/.shepherd
~/.shepherd/shepherd
mv ~/.shepherd ~/.shepherd.2
mv ~/.shepherd.1 ~/.shepherd
~/.shepherd/shepherd
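If you use the HOME approach with several installs, a small wrapper can run them one after another. This is a minimal sketch, not part of Shepherd: the function name run_all_installs is hypothetical, and /first/directory and /second/directory are the placeholder paths from the example above.

```shell
#!/bin/sh
# Run one Shepherd install per HOME directory, one after another.
# The directory names passed in are placeholders; substitute your own.
run_all_installs() {
    for home in "$@"; do
        if [ -x "$home/.shepherd/shepherd" ]; then
            echo "Running Shepherd with HOME=$home"
            HOME="$home" "$home/.shepherd/shepherd"
        else
            echo "Skipping $home: no executable .shepherd/shepherd found" >&2
        fi
    done
}

run_all_installs /first/directory /second/directory
```

Running the installs sequentially rather than in parallel keeps the load on the guide data sources down, in line with Shepherd's own practice of spacing out downloads.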
My high definition (HD) channels are missing programs?
You must configure standard definition (SD) channels for the corresponding HD channels for them to be populated correctly.
It is now possible to obtain fully populated HD channels with KNOWN HD programs flagged as HD. This is the DEFAULT for new installs. To enable on old installs, go to your shepherd install and execute:
Originally the HD channels were only populated with KNOWN HD programs. The idea is to have both SD and HD, and increase the priority of HD channels, so programs record as HD when available. To change to this behaviour use these commands:
cd ~/.shepherd/postprocessors/flag_aus_hdtv
ln -s ../../references/Shepherd
./flag_aus_hdtv --set=action:copy
rm Shepherd
WARNING: Some stations do upscaling from SD to HD; you could record the HD version but the SD version is at least half the size for the same detail. Also some stations run a program of scenery in a loop; if you recorded their HD channel you would miss your program.
Another way to obtain fully populated HD channels is to use the SD xmlids for the HD channels and remove your HD channels from Shepherd, BUT you will then miss any divergence between the SD channel and the HD channel.
If any additional programs should be flagged HD, please let us know on our mailing list.
How do I use the new high definition (HD) channels?
Because of problems with our data sources, with XMLTV (0.5.50 and earlier), and with MythTV (0.20 and earlier), please follow the directions on the HDTV page.
To make use of the HD flag, MythTV requires priorities to be set. In mythfrontend, traverse the menus Utilities/Setup -> Setup -> TV Settings -> Recording Priorities -> Set Recording, and set HDTV Recording Priority = 2. I also recommend enabling the Reschedule Higher Priorities option. Then traverse Next -> Finish -> Channel Priorities, select each of your HD channels, and press the left arrow to decrease its priority to -1. Cancel exits. This should tell MythTV to record flagged HD programs on an HD channel and non-flagged programs on an SD channel.
Some of my guide data looks wrong. How can I diagnose the problem?
Because Shepherd employs many different grabbers, the first step is to figure out where the dodgy data came from. If you're interested in a particular time, you can use the "--ancestry" option to see how Shepherd put together guide data for a particular time. For example, to look at the ancestry of data for next Tuesday from 10:30pm - 11pm:
~/.shepherd/tv_grab_au --ancestry "tuesday 10:30pm+30"
This will print out relevant guide data obtained during Shepherd's last successful run from each component. What you want to do is find the earliest point at which the data looks wrong.
- If the data looks wrong in a grabber, it's either a problem with the grabber itself or the grabber's data source.
- If the data looks fine in the grabber, but is bad in the output of a reconciler, postprocessor, or Shepherd itself, it's a Shepherd problem.
Either way, armed with this information you should be able to get further help from the mailing list.
The --ancestry option requires the Perl module File::Find::Rule. You can install it with the command:
sudo cpan File::Find::Rule
or, for Debian-based distributions:
sudo apt-get install libfile-find-rule-perl
Some shows have titles ALL IN CAPS, and often including unwanted info like "LIVE:" or "SEASON FINALE."
You have EIT enabled in MythTV, which is overriding the guide data supplied by Shepherd. See the MythTV EIT page for instructions on how to disable EIT. Note that EIT can be enabled on a global or per-channel basis, so ensure you turn it off everywhere.
Can I check Shepherd from within MythTV?
Yes, via Information Center -> System Status -> Listings Status.
My MythTV says "mythfilldatabase ran, but did not insert any new data"
That's fine, so long as it also says you have guide data for at least the next few days. Shepherd runs hourly by default, but only updates MythTV once a day. So most of the time, your most recent run won't have inserted new data.
If, however, you are seeing a "FAILED" message, or you are running low on guide data, something may be wrong.
I don't have any guide data in MythTV. How do I test or debug?
If you've just finished installing, Shepherd may be running right now. It takes a while, especially the first time. Check MythTV's Information Center, as described above. Or you can use the command line: ps cax | egrep "(mythfill|tv_grab_au|shepherd)".
To establish whether Shepherd has run at all, look in ~/.shepherd/log/. Each time Shepherd runs, it creates a new shepherd.log, and archives previous ones (e.g. shepherd.log.1.gz). If this directory is empty, Shepherd has never been run. If there are log files, reading the latest one or two may tell you what, if anything, went wrong.
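The log check described above can be scripted. This is a rough sketch, assuming the usual log location (~/.shepherd/log/) and the shepherd.log file name mentioned above; the function name check_shepherd_logs is hypothetical:

```shell
#!/bin/sh
# Report whether Shepherd has ever run, based on its log directory,
# and show the tail of the most recent log if there is one.
check_shepherd_logs() {
    log_dir=${1:-"$HOME/.shepherd/log"}
    if [ ! -d "$log_dir" ] || [ -z "$(ls -A "$log_dir" 2>/dev/null)" ]; then
        echo "No logs in $log_dir: Shepherd has probably never run."
        return 1
    fi
    echo "Log files, newest first:"
    ls -t "$log_dir"
    if [ -f "$log_dir/shepherd.log" ]; then
        echo "--- last 20 lines of shepherd.log ---"
        tail -n 20 "$log_dir/shepherd.log"
    fi
}

check_shepherd_logs || true
```

Archived logs such as shepherd.log.1.gz are compressed, so read them with a tool like "gunzip -c" rather than tail.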
Other useful commands to check Shepherd's basic health:
~/.shepherd/shepherd --status
~/.shepherd/shepherd --history
~/.shepherd/shepherd --check
If Shepherd seems OK, the issue may be that data is not getting into MythTV. See below.
Shepherd works when I run it from the command-line, but not automatically via mythfilldatabase.
Most problems relating to integration between Shepherd and MythTV can be solved with:
Shepherd will run within a few minutes of you doing this.
By default, mythfilldatabase suppresses Shepherd's output. You can run it manually and see what's actually going on like this:
mythfilldatabase --graboptions '--daily --notquiet'
Note that Shepherd's normal behavior is to prevent itself from running too frequently, in order to avoid overtaxing guide data resources. This is not an error.
Shepherd is running hourly on my system. Shouldn't it run once per day?
A default installation will set up an hourly cron job with the "--daily" option. This means Shepherd will be called once per hour, but will only download new guide data if it's been about a day since the last successful grab. Otherwise, it simply exits.
If you prefer, you can edit your crontab to run Shepherd once per day at a time of your choice. However, the default install has two advantages over this method:
- It works even if your system backend powers down when idle. A daily cron job, by contrast, requires your machine be on at that time every day.
- If Shepherd fails for some reason (e.g. a temporary network problem), it will try again in an hour, rather than waiting a whole day.
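The idea behind the hourly job with "--daily" can be illustrated with a few lines of shell. This is not Shepherd's actual code: the function name should_grab is hypothetical, and the 22-hour threshold is an assumption standing in for "about a day".

```shell
#!/bin/sh
# Illustration only: not Shepherd's actual code. Decide whether an hourly
# job should do real work, given the time of the last successful grab.
# MIN_GAP_SECONDS is an assumed threshold standing in for "about a day".
MIN_GAP_SECONDS=$((22 * 60 * 60))

should_grab() {
    last_success=$1   # epoch seconds of the last successful grab
    now=$2            # current epoch seconds
    [ $((now - last_success)) -ge "$MIN_GAP_SECONDS" ]
}

now=$(date +%s)
if should_grab $((now - 3600)) "$now"; then
    echo "grab"
else
    echo "exit quietly: last grab was only an hour ago"
fi
```

Because the check is keyed to the last successful grab, a failed run leaves the timestamp unchanged and the next hourly invocation tries again, which is the second advantage listed above.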
How can I prevent mythfilldatabase adding unwanted channels to my video sources?
Make sure mythfilldatabase (not Shepherd) is invoked with the "--update" option, so it will not add any missing channels to your video sources. (This can be an issue if you have video sources that receive different sets of channels, for example Free-to-Air TV and Pay TV.)
If you have set up Shepherd the default way, you can add the "--update" option to mythfilldatabase with:
Look for "mythfilldatabase" and insert "--update" immediately after it, so the line looks something like this:
44 * * * * mythfilldatabase --update --graboptions "--daily"
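If you would rather check the change on a copy of the line before editing your crontab by hand, the same edit can be expressed with sed. This is only a sketch that transforms text on standard input; it does not modify your crontab, and the function name add_update_option is hypothetical:

```shell
#!/bin/sh
# Insert --update after "mythfilldatabase" on a crontab line, unless it is
# already present. Operates on text from stdin; it does not touch your crontab.
add_update_option() {
    sed -e '/mythfilldatabase --update/!s/mythfilldatabase/mythfilldatabase --update/'
}

printf '%s\n' '44 * * * * mythfilldatabase --graboptions "--daily"' | add_update_option
```

Lines that already contain "mythfilldatabase --update" are left unchanged, so running the filter twice has no further effect.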
When mythfilldatabase runs (SVN release of mythtv, mythtv 0.21 and later), I can't see what Shepherd is doing
Starting with MythTV 0.21 (and SVN releases from around February 2007), mythfilldatabase by default adds '--quiet' to the command line when calling the tv_grab_au script. To negate this, you can use Shepherd's "--notquiet" option. You can make this a default with:
~/.shepherd/tv_grab_au --component-set shepherd:notquiet
To disable this, use:
~/.shepherd/tv_grab_au --component-set shepherd
My guide data is listed against the wrong channels inside MythTV.
This will try to compare your Shepherd channels to your MythTV channels:
If it doesn't look right, re-configure:
If Shepherd is not able to access your MythTV (e.g. it's a remote backend), you will need to manually check that the XMLTV IDs you assigned to channels in Shepherd match those you assigned in MythTV. It doesn't matter what each XMLTV ID is, just that they match. For example, if in the mythtv-setup Channel Editor you have "ABC Melbourne" set to an XMLTV ID of "abc.free.au", you should also specify this as the XMLTV ID for the ABC channel in Shepherd. You can do this via --configure, as shown above.
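For the manual check, one option is to write out the XMLTV IDs from each side as plain text, one per line, and compare the two lists. The sketch below assumes you have produced those two lists yourself; the function name compare_xmltv_ids, the file names, and the sample IDs other than "abc.free.au" are made up for illustration.

```shell
#!/bin/sh
# Compare two lists of XMLTV IDs (one ID per line) and print the mismatches.
# How you produce the two lists is up to you; the file names are placeholders.
compare_xmltv_ids() {
    shepherd_ids=$1
    mythtv_ids=$2
    a=$(mktemp) && b=$(mktemp) || return 2
    sort -u "$shepherd_ids" > "$a"
    sort -u "$mythtv_ids" > "$b"
    # comm -23 keeps lines unique to the first list, comm -13 to the second.
    comm -23 "$a" "$b" | sed 's/^/only in Shepherd: /'
    comm -13 "$a" "$b" | sed 's/^/only in MythTV: /'
    rm -f "$a" "$b"
}

# Sample data for demonstration only.
demo=$(mktemp -d)
printf '%s\n' abc.free.au sbs.free.au > "$demo/shepherd_ids.txt"
printf '%s\n' abc.free.au sbs.example > "$demo/mythtv_ids.txt"
compare_xmltv_ids "$demo/shepherd_ids.txt" "$demo/mythtv_ids.txt"
```

Any ID reported on either side is one that does not have an exact match on the other and needs correcting in Shepherd or in MythTV.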
Which timezone should I set MythTV to? Auto, None, +1000?
Usually it doesn't matter. Shepherd will look up MythTV's timezone setting, compare this to your system clock, and ensure everything lines up.
The ideal timezone, though (the one that requires no extra adjustments from Shepherd) is "None." If Shepherd is not on the same box as your MythTV, you must set your timezone to "None."
Shepherd assumes that your machine's timezone has been set correctly for the region you've selected. To check this is the case, make sure these commands both display the correct time and timezone for the region you're using Shepherd for:
perl -e 'use POSIX; print POSIX::strftime("%z %x %X %Z\n", localtime(time));'
date "+%z %x %X %Z"
If you need to set your machine's timezone, Debian-based distributions can use the 'tzconfig' command (run as root). Alternatively, set the TZ environment variable to correct any difference: add it to ~/.profile (to make it user-specific), to /etc/profile (to make it machine-wide), or set it just before you execute Shepherd. For example:
TZ="Australia/Brisbane"
export TZ
You will need to log off and back in to pick up the change.
Note: A common timezone problem (unrelated to Shepherd) is when times are out in MythWeb, but correct in the rest of MythTV. This is because PHP5, unlike PHP4, requires you to explicitly set a timezone in php.ini:
In a terminal, type:
sudo gedit /etc/php5/apache2/php.ini
Within gedit, Search > Go to line > 603 (the exact line number may differ between PHP versions), find the commented-out date.timezone setting, and change it to:
date.timezone = Australia/Adelaide
... or whatever is appropriate for your location. Make sure to remove the semi-colon (comment)!
Save, exit gedit, and restart apache:
sudo /etc/init.d/apache2 restart
Your MythWeb times should now match what is displayed within MythTV.