
Fun with CURL Scripts


Waltham Curling Club - Triumph, IL by Daveblog


I don’t often get to play with scripts at work; most of my time is spent in meetings or with spreadsheets. So it was a rare treat to do some basic scripting with CURL; my developers do so love it when I give them a helping hand.

We use CURL in a lot of our projects through the PHP CURL library; it’s a very powerful way of manipulating URIs, posting data and managing cookies. There’s an option for almost anything you could ever want to do with a web resource. The project in question has an API that is defined as a set of web resources which accept and send XML files; you can either POST an XML file or GET one. Usually a remote system will POST a file once or twice a day. As part of the set-up for a new customer, though, we needed to POST several hundred files. This is the sort of task that UNIX excels at; its ability to combine any available tool into a script with the minimum amount of fuss is a real joy.

CURL

The basic CURL command I used is:

curl --silent --write-out "%{http_code}" --basic --user <username>:<password> --data @<filename> --header "Content-Type:application/<application-type>;charset=UTF-8" <uri>

To run through the arguments in order:

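--silent means that CURL’s usual progress indicator and error messages won’t be displayed. The body of the response is still written to standard output, but a successful response from our API (204) has no body.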

--write-out "%{http_code}" makes the command print the HTTP response status code. This is one of many variables that can be used to format CURL output. The script I’m using only cares about success (status 204) or not success.

--basic sets the HTTP authentication mechanism to basic authentication (i.e. a username and password)

--user sets the username and password. The format of the argument is <username>:<password>

--data takes as its argument the data to be posted to the URL. If the argument begins with @ it is assumed to be a filename. So in this case we’re posting the data in <filename>

--header enables you to send custom header fields. Our application uses the Content-Type header to decide which XML schema to validate the posted data against. Note that the header should not include LF/CR but should be wrapped in quotes.

The final argument is the URI I’m posting to. I gave this without the “http” or “https” prefix, i.e. “api.domain.name” not “http://api.domain.name”; CURL then assumes HTTP, so spell out “https://” if the API needs it.

BASH

The CURL command above will POST one file to the API; wrapping the command in a bit of BASH will allow me to run any number of uploads. I’m not particularly concerned by how long these take to run, and I don’t want to overload the backend database, so I chose to upload these files serially rather than in parallel.

To keep things simple, I have one directory full of files to process and one directory where I’m going to keep completed files. These directories are called “todo” and “done”. Files that cause an error are kept in “todo” for manual examination.

The first thing we need to do is put the output of the CURL command (the status code) into a BASH variable; wrapping the command in backticks will achieve this. $F is a variable containing the file name of the XML to be POSTed.

RETURNVALUE=`curl --silent --write-out "%{http_code}" --basic --user <username>:<password> --data "@$F" --header "Content-Type:application/<application-type>;charset=UTF-8" <uri>`
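As a small illustration of command substitution, here is a sketch that runs without a network. The function fake_post is a hypothetical stand-in for the CURL call (it is not part of my script); it simply prints a status code the way --write-out would. The sketch also shows the $( ) form, which does the same job as backticks and is easier to nest.

```shell
#!/usr/bin/env bash
# fake_post is a hypothetical stand-in for the curl call, so this sketch
# needs no network. It prints a status code, as --write-out would.
fake_post() {
    printf '%s' "204"
}

# Both forms capture whatever the command writes to standard output.
WITH_BACKTICKS=`fake_post`
WITH_DOLLAR_PAREN=$(fake_post)

echo "Backticks captured: $WITH_BACKTICKS"
echo "Dollar-paren captured: $WITH_DOLLAR_PAREN"
```

Either form would work in the script; I have stuck with backticks below.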

So if the uploaded XML was processed correctly then RETURNVALUE is 204; otherwise it is something else. In BASH:

if [ "$RETURNVALUE" = "204" ]
then
    mv "$F" done
fi

Note that $RETURNVALUE and 204 are wrapped in double quotes and compared with =, which is a string comparison. If you wanted to compare both sides as integers you would use

[ "$RETURNVALUE" -eq 204 ]

Keep the quotes around the variable either way. If $RETURNVALUE is empty and unquoted it vanishes from the test altogether, and BASH throws the helpful error “unary operator expected”.
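To make the difference between the two comparisons concrete, here is a minimal sketch. The helper name is_success and the variable STATUS are my own inventions for illustration; they do not appear in the script itself.

```shell
#!/usr/bin/env bash
# is_success is a hypothetical helper, named here for illustration only.
# It succeeds (exit status 0) when its argument is the string "204".
is_success() {
    [ "$1" = "204" ]
}

is_success "204" && echo "string comparison matched"
# Because "$1" is quoted, an empty value is just a non-match, not an error.
is_success "" || echo "empty value did not match"

# Integer comparison with -eq; the variable is still quoted.
STATUS="204"
if [ "$STATUS" -eq 204 ]
then
    echo "integer comparison matched"
fi
```

For a status code either comparison would do; the string form has the small advantage that it never complains if CURL prints something unexpected.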

Now we loop through all the files in the todo directory using a simple for loop.

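First set FILES to the files we want to read, using a glob pattern (a shell wildcard, not a regular expression):

FILES="todo/*.xml"

Because of the quotes, FILES holds the pattern itself; BASH expands it to every file in the todo directory whose name ends in .xml when $FILES is later used unquoted.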
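The sketch below shows when the pattern is actually expanded. It works in a throwaway directory created with mktemp so it does not touch real data; WORKDIR, MATCHES and the file names a.xml, b.xml and notes.txt are assumptions made up for the example.

```shell
#!/usr/bin/env bash
# Build a scratch "todo" directory; all names here are illustrative.
WORKDIR=$(mktemp -d)
mkdir "$WORKDIR/todo"
touch "$WORKDIR/todo/a.xml" "$WORKDIR/todo/b.xml" "$WORKDIR/todo/notes.txt"
cd "$WORKDIR" || exit 1

FILES="todo/*.xml"
echo "Stored value: $FILES"    # quoted: prints the pattern itself
echo "Expanded:" $FILES        # unquoted: the shell expands the pattern

# An array expands the pattern once, at assignment time.
MATCHES=(todo/*.xml)
echo "Number of matches: ${#MATCHES[@]}"
```

Only the two .xml files are picked up; notes.txt is ignored. One thing to watch: if nothing matches, the unquoted pattern is left as it is rather than expanding to nothing.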

To access each file individually we use a for loop:

for F in $FILES
do
    echo "Processing $F"
done

The code above simply prints out the name of each file we want to process. Combine this with the CURL command and the if statement above and we have a working script that will POST all the files in the todo directory and move completed files to the done directory:

for F in $FILES
do
    # Skip the unexpanded pattern if todo contains no XML files
    [ -e "$F" ] || continue
    echo "Processing $F"
    RETURNVALUE=`curl --silent --write-out "%{http_code}" --basic --user <username>:<password> --data "@$F" --header "Content-Type:application/<application-type>;charset=UTF-8" <uri>`
    if [ "$RETURNVALUE" = "204" ]
    then
        mv "$F" done
    fi
done

Of course, if this were a production script we’d parametrise the names of the directories and add some error trapping. I left that as an exercise for one of my junior programmers.
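For the curious, here is one way that exercise might turn out. This is only a sketch under my own assumptions, not the script we ended up with: the function name post_all, the CURL_BIN variable and the order of the arguments are all invented for illustration. It also adds --output /dev/null so that any response body is thrown away and only the status code is captured.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: post_all, CURL_BIN and the argument order are
# assumptions for illustration, not a production script.

# CURL_BIN lets another command stand in for curl, e.g. for a dry run.
CURL_BIN="${CURL_BIN:-curl}"

# post_all TODO_DIR DONE_DIR URI USERPASS CONTENT_TYPE
# POSTs every *.xml file in TODO_DIR and moves each file that gets a 204
# into DONE_DIR. Returns non-zero if any file failed.
post_all() {
    local todo_dir=$1 done_dir=$2 uri=$3 userpass=$4 content_type=$5
    local f status failures=0

    if [ ! -d "$todo_dir" ] || [ ! -d "$done_dir" ]; then
        echo "Both $todo_dir and $done_dir must be directories" >&2
        return 2
    fi

    for f in "$todo_dir"/*.xml; do
        [ -e "$f" ] || continue    # no matches: the pattern stays literal
        echo "Processing $f"
        status=$("$CURL_BIN" --silent --output /dev/null \
            --write-out "%{http_code}" --basic --user "$userpass" \
            --data "@$f" --header "Content-Type: $content_type" "$uri")
        if [ "$status" = "204" ]; then
            mv "$f" "$done_dir"/
        else
            echo "Failed ($status): $f" >&2
            failures=$((failures + 1))
        fi
    done

    [ "$failures" -eq 0 ]
}
```

It would be called along the lines of post_all todo done "<uri>" "<username>:<password>" "application/<application-type>;charset=UTF-8". Failed files stay in the todo directory, as before, and their status codes go to standard error so they can be examined by hand.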