How is Code4Lib Journal indexed? What software is used, and more specifically, 
what characteristics of each article are included in the index?

Our journal is pretty cool, but as a library-related journal, I think it can be 
better. For example, what are the various indexed fields? Maybe we can support 
faceted browsing? Search results are returned in a very narrative form -- a 
format that is not very computable. If search results were available in some 
sort of columnar format (TSV, CSV, etc.), sorting and grouping would be 
possible, as well as analysis.
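
To illustrate what I mean by "computable", here is a minimal sketch in Python. 
It assumes search results came back as TSV; the column names and the sample 
rows are made-up placeholders, not anything the journal provides today.

```python
import csv
import io
from collections import defaultdict

# Hypothetical TSV search results; rows are placeholders, not real articles.
SAMPLE_TSV = (
    "issue\tdate\tauthor\ttitle\n"
    "2\t2020-04-01\tAuthor B\tExample article two\n"
    "1\t2020-01-01\tAuthor A\tExample article one\n"
    "2\t2020-04-01\tAuthor C\tExample article three\n"
)


def read_rows(text):
    """Parse TSV text into a list of dictionaries, one per result."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def sort_by(rows, field):
    """Return the rows sorted by the string value of the given field."""
    return sorted(rows, key=lambda row: row[field])


def group_by(rows, field):
    """Return a dictionary mapping each field value to its rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)
    return dict(groups)


rows = read_rows(SAMPLE_TSV)
by_date = sort_by(rows, "date")      # oldest first (ISO dates sort as text)
by_issue = group_by(rows, "issue")   # e.g. by_issue["2"] holds two rows
```

With narrative-style results, none of this is possible without scraping first.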

Recently, I have been playing a lot with natural language processing and this 
has resulted in the extraction of statistically significant keywords, named 
entities, parts-of-speech, and even the identification of sentences matching a 
given grammar. All of these things lend themselves as inputs to machine 
learning processes. In turn, the results of all these things can be 
re-incorporated into an index of Code4Lib. Thus the index not only supports 
find & get but also analysis. For a good time, I'd like to give this a go, just 
as an experiment. 
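
As a rough idea of the keyword step, here is a minimal sketch that ranks words 
with plain TF-IDF. This is a stand-in I am assuming for illustration, not the 
exact method I have been using, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(document) for document in documents]

    # Count how many documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    total = len(documents)
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(total / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_n])
    return results
```

Keywords computed this way could be written back into the index as one more 
field per article, which is the "re-incorporated" part.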

Is there someplace where I can download a rudimentary metadata file of all 
Code4Lib articles? At the least, I hope such a metadata file includes fields 
such as:

  * author(s)
  * title
  * date
  * abstract
  * link to full text
  * issue

Is there a place where I can get such metadata?
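
To make the request concrete, here is a minimal sketch of the sort of file I 
have in mind, again in Python. The field names, the "; " author separator, and 
the example.org link are my own assumptions, and the record is a placeholder.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Article:
    """One row of the hoped-for metadata file (field names are assumptions)."""
    authors: str   # multiple authors joined with "; "
    title: str
    date: str      # ISO 8601, e.g. 2020-01-01
    abstract: str
    url: str       # link to full text
    issue: str


def to_tsv(articles):
    """Serialize articles to TSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[field.name for field in fields(Article)],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for article in articles:
        writer.writerow(asdict(article))
    return buffer.getvalue()


# Placeholder record, not a real article.
example = Article(
    authors="Author A; Author B",
    title="Example article one",
    date="2020-01-01",
    abstract="Placeholder abstract.",
    url="https://example.org/articles/1",
    issue="1",
)
tsv_text = to_tsv([example])
```

Any similar flat format (CSV, JSON, etc.) would work just as well.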

--
Eric Morgan
