[CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-18 Thread William Denton
I'm going to analyze a whack of transaction logs from our Symphony ILS so that 
we can dig into collection usage.  Any of you out there done this?  Because the 
system is so closed and proprietary I understand it's not easy (perhaps 
impossible?) to share code (publicly?), but if you've dug into it I'd be curious 
to know, not just about how you parsed the logs but then what you did with 
it, whether you loaded bits of data into a database, etc.


Looking around, I see a few examples of people using the system's API, but 
that's it.


Bill
--
William Denton ↔  Toronto, Canada ↔  https://www.miskatonic.org/

Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-19 Thread Michelle Suranofsky
Hi Bill,

I have been working on parsing our logs so we can migrate all of our
historical circ transactions into OLE.  I was recently able to use the data
pulled out of the logs to provide circ counts to our acquisitions department
for a vendor-provided spreadsheet of items/ISBNs (that we had purchased).

After using the Sirsi API to pull all of the charges and renewals out of
the logs, I've been using Java to parse through these text files and insert
the information into a SQLite database (as a 'staging' database).  From
there the transactions can be queried (and, for me, prepped to migrate).
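
For illustration only (Michelle's actual code is Java), a minimal Python sketch of that staging step, assuming a made-up pipe-delimited extract with transaction type, item ID, user ID, and date -- the real field layout from the Sirsi API will differ:

```python
import sqlite3

# Hypothetical extract: one pipe-delimited transaction per line.
SAMPLE_LINES = [
    "CHARGE|31221001234567|PATRON01|2015-03-01",
    "RENEW|31221001234567|PATRON01|2015-03-15",
]

def load_staging(lines, db_path=":memory:"):
    """Load extracted transaction lines into a SQLite staging table."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS transactions (
        txn_type TEXT, item_id TEXT, user_id TEXT, txn_date TEXT)""")
    rows = (line.strip().split("|") for line in lines)
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn

conn = load_staging(SAMPLE_LINES)
# Circ count per item, the kind of query the acq spreadsheet needed.
count = conn.execute(
    "SELECT COUNT(*) FROM transactions WHERE item_id = ?",
    ("31221001234567",)).fetchone()[0]
print(count)  # → 2
```

From a staging table like this, per-item or per-ISBN counts are a single GROUP BY away.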

I would be happy to share my code/process with you.

Michelle
mis...@lehigh.edu



Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-19 Thread Adam Constabaris
Bill,

If you are talking about parsing Sirsi transaction logs specifically, it's
fairly straightforward to do so with regular expressions and a small amount
of code.  We warehouse data extracted from our logs every night.
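
As a rough illustration of the regex approach: the caret-delimited line below is a made-up stand-in (two-letter data codes followed by values), not the real Symphony history format, but the parsing pattern is the same:

```python
import re

# Made-up sample in the general spirit of a history log entry:
# a header token, then caret-delimited two-letter data codes.
line = "E201503190914120000R ^NQ31221001234567^UOPATRON01^HB20150402"

# Each field: a two-character code followed by its value.
FIELD_RE = re.compile(r"\^(\w{2})([^^]*)")

def parse_entry(line):
    header, _, rest = line.partition(" ")
    fields = dict(FIELD_RE.findall(rest))
    return header, fields

header, fields = parse_entry(line)
print(fields.get("NQ"))  # → 31221001234567 (item barcode in this made-up layout)
```

A nightly job can stream each log through a parser like this and write the field dicts straight to a warehouse table.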

If you're talking about working with data retrieved from Sirsi's APIs  more
generally, quite a bit of that can also be done without too much effort.

cheers,

AC



Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-19 Thread Andrew Nisbet
Hi Bill,

I have been doing some work with Symphony logs using Elasticsearch. It is 
simple to install and use, though I recommend Elasticsearch: The Definitive 
Guide (http://shop.oreilly.com/product/0636920028505.do). The main problem is 
the size of the history logs, ours being on the order of 5,000,000 lines per 
month. 

Originally I used a simple Python script to load each record. The script broke 
each line down into the command code, then all the data codes, then loaded them 
using curl. This failed initially because Symphony writes extended characters 
to title fields. I then ported the script to Python 3.3, which was not 
difficult, and everything loaded fine -- but loading a month's worth of data 
took too long. I am now experimenting with the Bulk API 
(http://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html) 
to improve performance.
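
For reference, the _bulk body is just newline-delimited JSON: an action line followed by a document line for each record. A sketch of building such a payload (the index and field names here are placeholders, not Andrew's actual schema):

```python
import json

def to_bulk_payload(records, index="symphony-hist"):
    """Build an Elasticsearch _bulk body: an action line plus a
    document line per record, newline-delimited, with a trailing
    newline as the Bulk API requires."""
    lines = []
    for rec in records:
        lines.append(json.dumps({"index": {"_index": index}}))
        # ensure_ascii=False keeps extended characters in titles intact.
        lines.append(json.dumps(rec, ensure_ascii=False))
    return "\n".join(lines) + "\n"

records = [
    {"cmd": "CV", "item": "31221001234567", "date": "2015-03-19"},
    {"cmd": "RV", "item": "31221007654321", "date": "2015-03-19"},
]
payload = to_bulk_payload(records)
# POST this body to the cluster's /_bulk endpoint in batches of a few
# thousand records instead of one curl call per line.
```

Batching this way removes the per-document HTTP round trip, which is usually the bottleneck when loading millions of lines.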

I would certainly be willing to share what I have written if you would like. 
The code is too experimental to post to Github however.

Edmonton Public Library
Andrew Nisbet
ILS Administrator

T: 780.496.4058   F: 780.496.8317



Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-19 Thread Cary Gordon
Has anyone considered using a NoSQL database to store their logs? With enough 
memory, Redis might be interesting, and it would be fast.

The concept of "too experimental to post to Github" boggles the mind.

Cary




Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-19 Thread Andrew Nisbet
Elasticsearch is a NoSQL database 
(http://www.slideshare.net/DmitriBabaev1/elastic-search-moscow-bigdata-cassandra-sept-2013-meetup)
and much easier to install and manage than Mongo or CouchDB. 

Why 'boggle'? It's a 'hello world' sketch: no exception guarding, hard-coded 
URLs, and other embarrassing no-nos... 

... ok, fine https://github.com/anisbet/hist

Edmonton Public Library
Andrew Nisbet
ILS Administrator

T: 780.496.4058   F: 780.496.8317



Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-19 Thread Jason Stirnaman
I've been using the ELK (elastic + logstash(1) + kibana)(2) stack for EZProxy 
log analysis.
Yes, the index can grow really fast with log data, so I have to be selective 
about what I store. I'm not familiar with the Symphony log format, but Logstash 
has filters to handle just about any data that you want to parse, including 
multiline. Maybe for some log entries, you don't need to store the full entry 
at all but only a few bits or a single tag?

And because it's Ruby underneath, you can filter using custom Ruby. I use that 
to do LDAP lookups on user names so we can get department and user-type stats.

1. http://logstash.net/
2. https://www.elastic.co/downloads


Jason

Jason Stirnaman, MLS
Application Development, Library and Information Services, IR
University of Kansas Medical Center
jstirna...@kumc.edu
913-588-7319



Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-20 Thread William Denton

On 19 March 2015, Jason Stirnaman wrote:

> I've been using the ELK (elastic + logstash(1) + kibana)(2) stack for EZProxy 
> log analysis.


That sounds like a great way to look at the Symphony transaction logs, and I'm 
going to try it.  Thanks!  Having all these logs will be a great way to learn 
these applications.


Thanks for all the other replies about the logs.  People are analyzing them in 
different ways with different tools, and everyone finds something right for 
them.  That's always good to see.  I'll write up what I do in case it's useful.


A general question for anyone looking at usage of circulating physical items: 
what have you found to be the most useful questions you ask of the data, or the 
most worthwhile reports and visualizations you make?  I'm going to start with 
some obvious ones (what circs the most, what kind of user is most active, which 
locations are busiest, what call number ranges get used the most) but I wonder 
what others have found to be the most interesting and helpful.


Bill
--
William Denton ↔  Toronto, Canada ↔  https://www.miskatonic.org/

Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-21 Thread Francis Kayiwa

On 3/19/15 3:53 PM, Jason Stirnaman wrote:

> I've been using the ELK (elastic + logstash(1) + kibana)(2) stack for EZProxy 
> log analysis.
> Yes, the index can grow really fast with log data, so I have to be selective 
> about what I store. I'm not familiar with the Symphony log format, but Logstash 
> has filters to handle just about any data that you want to parse, including 
> multiline. Maybe for some log entries, you don't need to store the full entry 
> at all but only a few bits or a single tag?
>
> And because it's Ruby underneath, you can filter using custom Ruby. I use that 
> to do LDAP lookups on user names so we can get department and user-type stats.


Hey Jason,

Did you have to create customized grok filters for EZProxy logs format? 
It has been something on my mind and if you've done the work... ;-)


Cheers,

./fxk

--
Your analyst has you mixed up with another patient.  Don't believe a
thing he tells you.


Re: [CODE4LIB] Anyone analyzed SirsiDynix Symphony transaction logs?

2015-03-22 Thread Jason Stirnaman
Francis,

I was able to use Logstash's existing patterns for what I needed.

Depending on how you configure the logging, the format can be identical to 
Apache's.

I may have some custom expressions for query params, but you can also do a lot 
with ES' dynamic fields, which will keep the index smaller.
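
Since EZProxy can emit Apache's combined log format, Logstash's stock COMBINEDAPACHELOG grok pattern generally covers it; the same extraction can be sketched with a plain regex (the group names here are illustrative, not Logstash's field names):

```python
import re

# Apache combined log format, as EZProxy can be configured to emit.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\S+)'
)

sample = ('10.0.0.1 - jdoe [19/Mar/2015:15:53:00 -0500] '
          '"GET /login?url=http://example.com/ HTTP/1.1" 302 512')

m = COMBINED.match(sample)
entry = m.groupdict()
print(entry["user"], entry["status"])
```

The `user` field is the hook for the LDAP lookup mentioned above: resolve it to a department once per entry and store only the department, keeping the index small.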

I have the template on Github, but I'm not sure it's the latest. I'll check and 
post the link.



Jason
