Re: Parsing dating during indexing - Year Only
I'm not sure I understand your question ... if you know that you are only ever going to have the 'year', then why not just index the year as an int?

A TrieDateField isn't really of any use to you here: the normal date type features (date math, date ranges) don't help when you don't have any real date values (ie: it's ambiguous whether 2007 should match just_the_year:[2006-06-01T00:00:00Z TO 2007-06-01T00:00:00Z]).

If you really need a true date field because *most* of your documents have real dates, but only sometimes do you ingest documents with only the year, and when you ingest documents like this you want to assume some fixed month/day/hour/etc., then you can easily do this with update processors ... consider a chain of:

RegexReplaceProcessorFactory: just_the_year: ^(\d+)$ -> $1-01-01T00:00:00Z
CloneFieldUpdateProcessorFactory: just_the_year -> real_date_field
FirstFieldValueUpdateProcessorFactory: real_date_field (if a doc already had a value in the real field, ignore the new year-only value)

https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html
https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/FirstFieldValueUpdateProcessorFactory.html

: Date: Fri, 19 Jun 2015 13:57:04 -0700 (MST)
: From: levanDev levandev9...@gmail.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Parsing dating during indexing - Year Only

-Hoss
http://www.lucidworks.com/
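For anyone wanting to try the three-processor chain described above, it might look roughly like this in solrconfig.xml. This is a sketch, not a tested config: the chain name "year-to-date" is made up for illustration, the trailing Log/Run processors are the usual chain ending rather than something from the reply, and the literalReplacement setting is an assumption worth checking against the RegexReplaceProcessorFactory javadocs for your version (when literal replacement is in effect, $1 is inserted as plain text instead of the captured year).

```xml
<!-- Sketch only: "year-to-date" is a hypothetical chain name -->
<updateRequestProcessorChain name="year-to-date">
  <!-- 1. Rewrite a bare year such as 2010 into 2010-01-01T00:00:00Z -->
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">just_the_year</str>
    <str name="pattern">^(\d+)$</str>
    <str name="replacement">$1-01-01T00:00:00Z</str>
    <!-- Assumption: must be false so that $1 is read as a group reference -->
    <bool name="literalReplacement">false</bool>
  </processor>
  <!-- 2. Append the rewritten value to the real date field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">just_the_year</str>
    <str name="dest">real_date_field</str>
  </processor>
  <!-- 3. Keep only the first value, so an existing real date wins -->
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">real_date_field</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain would then be selected per request with the update.chain parameter (update.chain=year-to-date here). The order matters: the clone step appends to real_date_field, so a document that already carries a real date keeps it once the last processor drops everything but the first value.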
Parsing dating during indexing - Year Only
Hello,

Example csv doc has column 'just_the_year' and value '2010'.

With the Schema API I can tell the indexing process to treat 'just_the_year' as a date field.

I know that I can update the solrconfig.xml to correctly parse formats such as MM/dd/ (which is awesome), but has anyone tried to convert just the year value to a full date (2010-01-01T00:00:00Z) by updating the solrconfig.xml?

I know it's possible to import the csv, do the date transformation, export again and have everything work nicely, but it would be cool to reduce the number of steps involved and use the powerful date processor.

Thank you,
Levan

--
View this message in context: http://lucene.472066.n3.nabble.com/Parsing-dating-during-indexing-Year-Only-tp4213045.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Parsing dating during indexing - Year Only
Hi Chris,

Thank you for taking the time to write the detailed response. Very helpful.

I'm dealing with some interesting formats in the source data and trying to evaluate various options for our business needs. The second scenario you described (where some values in the date field are just the year) will either come up pretty soon for me or will certainly help someone else dealing with that issue currently.

Thank you,
Levan
Re: Parsing dating during indexing - Year Only
Hmm, I can see some things you couldn't do with just using a tint field for the year. Or rather, some things that wouldn't be as convenient.

But this might help:
http://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html

or you can also consider a ScriptUpdateProcessor:
http://wiki.apache.org/solr/ScriptUpdateProcessor

Best,
Erick
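As a rough illustration of the ParseDateFieldUpdateProcessorFactory route, a processor entry along these lines could be added to an update chain. This is an untested sketch: restricting it to just_the_year and listing a bare yyyy pattern are both assumptions, so check the javadocs linked above to confirm that such a pattern is accepted and that a year-only value comes out as the first instant of that year.

```xml
<!-- Sketch only: the field restriction and the format list are assumptions -->
<processor class="solr.ParseDateFieldUpdateProcessorFactory">
  <!-- Only touch the year-only column from the CSV -->
  <str name="fieldName">just_the_year</str>
  <!-- Interpret values without a zone as UTC -->
  <str name="defaultTimeZone">UTC</str>
  <arr name="format">
    <!-- Formats are tried in order; the bare-year pattern goes last -->
    <str>yyyy-MM-dd</str>
    <str>yyyy</str>
  </arr>
</processor>
```

Compared with the regex approach, this keeps all date handling in one processor, at the cost of relying on how the date parser fills in the missing month and day.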