Mike Thomsen created SOLR-9525:
----------------------------------

             Summary: split() function for streaming
                 Key: SOLR-9525
                 URL: https://issues.apache.org/jira/browse/SOLR-9525
             Project: Solr
          Issue Type: Wish
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Mike Thomsen

This is the original description I posted on solr-user:

I read this article and thought it could be an interesting way to do ingestion:

https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1

Example from the article:

    daemon(id="12345",
           runInterval="60000",
           update(users,
                  batchSize=10,
                  jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
                       sql="SELECT id, name FROM users",
                       sort="id asc",
                       driver="com.mysql.jdbc.Driver")))

What's the best way to handle a multivalue field using this API? Is there a way to tokenize something returned in a database field?

Joel Bernstein responded with this:

Unfortunately there currently isn't a way to split a field. But this would be nice functionality to add. The approach would be to add a split operation that would be used by the select() function. It would look like this:

    select(jdbc(...), split(fieldA, delim=","), ...)

This would make a good jira issue.

So the TL;DR version is that I need the ability to specify, in such a streaming operation, certain fields to tokenize into multivalue fields. In one schema I may have to support, there are probably half a dozen such fields.

Perhaps I am missing a feature here, but it looks like this new capability cannot handle multivalue fields until something like this is in place.
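
To make the request concrete, here is a minimal Python sketch of the behavior I have in mind. It is not Solr code and does not use any Solr API: split_field is a hypothetical helper, the "roles" column is an invented example that is not in the SQL above, and trimming whitespace and dropping empty tokens are my own assumptions about how a split operation might behave.

```python
from typing import Any, Dict, Iterable, Iterator


def split_field(
    tuples: Iterable[Dict[str, Any]], field: str, delim: str = ","
) -> Iterator[Dict[str, Any]]:
    """Yield copies of each tuple with `field` tokenized into a list.

    Hypothetical helper that mimics what split(fieldA, delim=",")
    might do to each tuple in a stream. Non-string values (for example
    a NULL column) are passed through unchanged.
    """
    for tup in tuples:
        out = dict(tup)  # shallow copy so the input tuple is not modified
        value = out.get(field)
        if isinstance(value, str):
            # Assumption: trim whitespace and drop empty tokens.
            out[field] = [part.strip() for part in value.split(delim) if part.strip()]
        yield out


# Rows as they might come back from a JDBC query, with a
# comma-delimited "roles" column (invented for this example).
rows = [
    {"id": 1, "name": "alice", "roles": "admin, editor"},
    {"id": 2, "name": "bob", "roles": "viewer"},
    {"id": 3, "name": "carol", "roles": None},
]

docs = list(split_field(rows, "roles"))
for doc in docs:
    print(doc)
```

Each resulting document carries a list in the split field, which is the shape a multivalue field would need at index time.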