Hi,

Inspired by the blog post http://lucidworks.com/blog/poor-mans-entity-extraction-with-solr/ , I'm using an update request processor chain that calls a JavaScript file, update-script.js. This script extracts latitude/longitude coordinate pairs from the input document content using regular expression matching. The goal is to add the extracted locations to the following field in the Solr index:

    <field name="extracted_locations" type="location_rpt" indexed="true" stored="true" multiValued="true"/>

This works fine with just one coordinate pair, e.g. "38.1384683,-78.4527887". However, if the input document contains several lat/long pairs, I am not able to add them to the multivalued field.

Based on information in this thread http://lucene.472066.n3.nabble.com/multivalue-location-rpt-field-not-indexing-with-JSON-format-td4065935.html , update-script.js returns an extloc string that looks, for example, like this:

    ["38.1384683,-78.4527887","58.1384683,-38.4527887","68.1384683,-45.4527887","58.1384683,-38.4527887"]

When the script runs the doc.setField operation below:

    doc = cmd.solrDoc;
    extloc = extractLocations(content, regex);
    doc.setField("extracted_locations", extloc);

Solr returns the following error message:

128875292 [qtp1778535015-19] INFO org.apache.solr.update.processor.LogUpdateProcessor – [deduplicationtest] webapp=/solr path=/update/extract params={literal.uri=C:\testdata\latlongtest.txt&resource.name=latlongtest.txt&literal.id=file:/C:/testdata/latlongtest.txt&wt=xml&version=2.2&literal.cat=filesys} {} 0 389
128875307 [qtp1778535015-19] ERROR org.apache.solr.core.SolrCore – org.apache.solr.common.SolrException: Couldn't parse shape '["38.1384683,-78.4527887","58.1384683,-38.4527887","68.1384683,-45.4527887","58.1384683,-38.4527887"]' because: For input string: "["38.1384683"
    at org.apache.solr.schema.AbstractSpatialFieldType.parseShape(AbstractSpatialFieldType.java:175)
    at org.apache.solr.schema.AbstractSpatialFieldType.createFields(AbstractSpatialFieldType.java:139)
    at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:50)
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:125)
    at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:78)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:238)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:867)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1021)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:690)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.StatelessScriptUpdateProcessorFactory$ScriptUpdateProcessor.processAdd(StatelessScriptUpdateProcessorFactory.java:375)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:121)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:126)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:246)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:636)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.NumberFormatException: For input string: "["38.1384683"
    at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
    at sun.misc.FloatingDecimal.parseDouble(Unknown Source)
    at java.lang.Double.parseDouble(Unknown Source)
    at com.spatial4j.core.io.ParseUtils.parsePointDouble(ParseUtils.java:108)
    at com.spatial4j.core.io.ParseUtils.parseLatitudeLongitude(ParseUtils.java:145)
    at com.spatial4j.core.io.ParseUtils.parseLatitudeLongitude(ParseUtils.java:137)
    at com.spatial4j.core.io.LegacyShapeReadWriterFormat.readLatCommaLonPoint(LegacyShapeReadWriterFormat.java:169)
    at com.spatial4j.core.io.LegacyShapeReadWriterFormat.readShapeOrNull(LegacyShapeReadWriterFormat.java:153)
    at org.apache.solr.schema.AbstractSpatialFieldType.parseShape(AbstractSpatialFieldType.java:167)
    ... 49 more

What am I doing wrong? Is it the wt=xml parameter that causes this error? If so, how can the script call doc.setField with wt=json? Alternatively, how should the multivalued input be formatted in XML?

Thanks in advance!

- e5h5s7
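
P.S. The NumberFormatException above suggests that the whole bracketed string is being parsed as a single lat,lon point. A minimal JavaScript sketch of one alternative, adding each pair as its own value with addField instead of passing one string to setField, is below. The helper addLocations, the LAT_LON pattern, and the "content" source field name are assumptions for illustration and not the actual update-script.js; I have not confirmed that this resolves the error.

```javascript
// Sketch only: the helper names, the pattern, and the "content" field
// are illustrative assumptions, not the original update-script.js.

// Assumed pattern: signed decimal latitude, comma, signed decimal longitude.
var LAT_LON = /-?\d{1,2}\.\d+,-?\d{1,3}\.\d+/g;

// Collect every "lat,lon" match in the text as a plain array of strings.
function extractLocations(content, regex) {
  var matches = String(content).match(regex);
  return matches ? matches : [];
}

// Add each pair as its own field value rather than one bracketed string.
function addLocations(doc, fieldName, locations) {
  for (var i = 0; i < locations.length; i++) {
    doc.addField(fieldName, locations[i]);
  }
}

// Called by the script update processor for every added document.
function processAdd(cmd) {
  var doc = cmd.solrDoc;
  var content = doc.getFieldValue("content"); // assumed source field name
  addLocations(doc, "extracted_locations", extractLocations(content, LAT_LON));
}
```

The idea is that each addField call hands the spatial field exactly one "lat,lon" string, which is the form that already works for a single pair.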