Hello all,

I am using Solr 1.4.1 with the DataImportHandler, and I am trying to follow the advice from http://www.mail-archive.com/solr-user@lucene.apache.org/msg11887.html about converting date fields to SortableLong fields for better memory efficiency. However, whenever I try to do this using the DateFormatTransformer, I get an exception during indexing for every row that tries to create my sortable fields.

In my schema.xml, I have the following definitions for the fieldType and dynamicField:

<fieldType name="sdate" class="solr.SortableLongField" indexed="true" stored="false" sortMissingLast="true" omitNorms="true" />
<dynamicField name="sort_date_*" type="sdate" stored="false" indexed="true" />

In my dih.xml, I have the following definitions:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="xml_stories"
            rootEntity="false"
            dataSource="null"
            processor="FileListEntityProcessor"
            fileName="legacy_stories.*\.xml$"
            recursive="false"
            baseDir="/usr/local/extracts"
            newerThan="${dataimporter.xml_stories.last_index_time}" >
      <entity name="stories"
              pk="id"
              dataSource="xml_stories"
              processor="XPathEntityProcessor"
              url="${xml_stories.fileAbsolutePath}"
              forEach="/RECORDS/RECORD"
              stream="true"
              transformer="DateFormatTransformer,HTMLStripTransformer,RegexTransformer,TemplateTransformer"
              onError="continue" >
        <field column="_modified_date" xpath="/RECORDS/RECORD/pr...@name='R_ModifiedTime']/PVAL" />
        <field column="modified_date" sourceColName="_modified_date" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
        <field column="_df_date_published" xpath="/RECORDS/RECORD/pr...@name='R_StoryDate']/PVAL" />
        <field column="df_date_published" sourceColName="_df_date_published" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
        <field column="sort_date_modified" sourceColName="modified_date" dateTimeFormat="yyyyMMddhhmmss" />
        <field column="sort_date_published" sourceColName="df_date_published" dateTimeFormat="yyyyMMddhhmmss" />
      </entity>
    </entity>
  </document>
</dataConfig>

The fields in question are in the following format:

<RECORDS>
  <RECORD>
    <PROP NAME="R_StoryDate">
      <PVAL>2001-12-04T00:00:00Z</PVAL>
    </PROP>
    <PROP NAME="R_ModifiedTime">
      <PVAL>2001-12-04T19:38:01Z</PVAL>
    </PROP>
  </RECORD>
</RECORDS>

The exception that I am receiving is:

Oct 15, 2010 6:23:24 PM org.apache.solr.handler.dataimport.DateFormatTransformer transformRow
WARNING: Could not parse a Date field
java.text.ParseException: Unparseable date: "Wed Nov 28 21:39:05 EST 2007"
    at java.text.DateFormat.parse(DateFormat.java:337)
    at org.apache.solr.handler.dataimport.DateFormatTransformer.process(DateFormatTransformer.java:89)
    at org.apache.solr.handler.dataimport.DateFormatTransformer.transformRow(DateFormatTransformer.java:69)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:195)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:241)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)

I know that the problem has to be the SortableLong fields, because if I remove just the two sort_date_* field lines from my dih.xml, everything imports as I expect it to. The modified_date and df_date_published fields index just fine, so long as I define them as shown above.

Am I doing something wrong? Am I misusing the SortableLong field and/or the DateFormatTransformer? Is this not supported in my version of Solr? I'm not very experienced with Java, so digging into the code would be a lost cause for me right now. I was hoping that somebody here might be able to point me in the right direction.

Thank you,
- Ken

It looked like something resembling white marble, which was probably what it was: something resembling white marble.
-- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"
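
For anyone trying to reproduce this outside of Solr: the text in the exception ("Wed Nov 28 21:39:05 EST 2007") has the shape of java.util.Date.toString() output, which suggests the sort_date_* fields may be receiving an already-parsed Date from modified_date rather than the original string. The following is only a minimal sketch of that guess using the plain JDK. The class and method names (DateParseSketch, parseIso, parsesAsCompact, toSortableLong) are hypothetical, and that DateFormatTransformer behaves this way internally is an assumption, not something confirmed from the Solr source.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical standalone sketch; not part of Solr or the DataImportHandler.
class DateParseSketch {

    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Parse a value shaped like the PVAL elements, e.g. 2001-12-04T19:38:01Z.
    // 'Z' is a quoted literal in the pattern, so the zone is set explicitly.
    static Date parseIso(String value) {
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        iso.setTimeZone(UTC);
        try {
            return iso.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not an ISO-style date: " + value, e);
        }
    }

    // Report whether a piece of text can be parsed with the compact pattern.
    static boolean parsesAsCompact(String text) {
        SimpleDateFormat compact = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        compact.setTimeZone(UTC);
        try {
            compact.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // Format (rather than parse) a Date into a yyyyMMddHHmmss number in UTC.
    static long toSortableLong(Date date) {
        SimpleDateFormat compact = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        compact.setTimeZone(UTC);
        return Long.parseLong(compact.format(date));
    }

    public static void main(String[] args) {
        Date modified = parseIso("2001-12-04T19:38:01Z");

        // Feeding Date.toString() to the compact pattern fails to parse.
        System.out.println("toString() form parses: " + parsesAsCompact(modified.toString()));

        // Formatting the Date produces a value that fits in a long.
        System.out.println("sortable long: " + toSortableLong(modified));

        // The raw epoch milliseconds are another long that sorts chronologically.
        System.out.println("epoch millis: " + modified.getTime());
    }
}
```

Note that the sketch uses HH (hour of day, 0-23) where the dih.xml above uses hh, which in SimpleDateFormat is the 12-hour clock field. If the guess holds, the sortable fields would need a step that formats a Date rather than one that parses a string. Whether Solr 1.4.1 ships a transformer for that is not something this sketch answers.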