It did not work, I tried many things and ended up trying this:

<requestHandler name="/dataimport" initParams="myInitParams" class="solr.DataImportHandler">
  <lst name="defaults">
    <str name="config">solr-data-config.xml</str>
  </lst>
</requestHandler>

<initParams name="myInitParams" path="/update/**,/dataimport">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>

Regards,
Pierre

> On 10 Aug 2016, at 18:08, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
> Your initParams section does not apply to /dataimport handler as
> defined. Try modifying it to say:
> path="/update/**,/dataimport"
>
> Hopefully, that's all that takes.
>
> Managed schema is enabled by default, but schemaless mode is the next
> layer on top. With managed schema, you can use the API to add your
> fields (or new Admin UI in the Schema screen). With schemaless mode,
> it tries to guess the field type as it adds it automatically.
>
>
> Regards,
> Alex.
>
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 10 August 2016 at 18:04, Pierre Caserta <pierre.case...@gmail.com> wrote:
>> Hi Alex,
>> thanks for your answer.
>>
>> Yes, my solrconfig.xml contains the add-unknown-fields-to-the-schema chain.
>>
>> <initParams path="/update/**">
>>   <lst name="defaults">
>>     <str name="update.chain">add-unknown-fields-to-the-schema</str>
>>   </lst>
>> </initParams>
>>
>> I created my core using this command:
>>
>> curl 'http://192.168.99.100:8999/solr/admin/cores?action=CREATE&name=solrexchange&instanceDir=/opt/solr/server/solr/solrexchange&configSet=data_driven_schema_configs_custom'
>>
>> I am using the example configset data_driven_schema_configs and I simply
>> added:
>>
>> <lib dir="${solr.install.dir:../../../..}/dist/"
>>      regex="solr-dataimporthandler-.*\.jar" />
>> <requestHandler name="/dataimport" class="solr.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">data-config.xml</str>
>>   </lst>
>> </requestHandler>
>>
>> I thought the schemaless mode was enabled by default, but I also tried adding
>> this config and I get the same result.
>>
>> <schemaFactory class="ManagedIndexSchemaFactory">
>>   <bool name="mutable">true</bool>
>>   <str name="managedSchemaResourceName">managed-schema</str>
>> </schemaFactory>
>>
>> How can I update my schemaless URP chain and add the parameter to call it to
>> DIH?
>>
>>
>>> On 10 Aug 2016, at 17:43, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>>>
>>> Do you have the actual fields defined? If not, then I am guessing that
>>> your 'post' test was against a different collection that had
>>> schemaless mode enabled and your DIH one is against one where
>>> schemaless mode is not enabled (look for
>>> 'add-unknown-fields-to-the-schema' in the solrconfig.xml to confirm).
>>> Solr examples for DIH do not have schemaless mode enabled.
>>>
>>> I _believe_ you can copy the schemaless URP chain and add the
>>> parameter to call it to DIH handler and it _should_ work. But I am not
>>> betting on it without testing it, as DIH also has some magic code to
>>> ignore fields not defined in schema because it is designed to work
>>> with only extracting relevant fields from the database even with
>>> 'select *' statement.
>>>
>>>
>>> Regards,
>>> Alex.
>>> ----
>>> Newsletter and resources for Solr beginners and intermediates:
>>> http://www.solr-start.com/
>>>
>>>
>>> On 10 August 2016 at 17:12, Pierre Caserta <pierre.case...@gmail.com> wrote:
>>>> Hi,
>>>> It seems that using the DataImportHandler with an XPathEntityProcessor
>>>> config
>>>> with a managed-schema setup only imports the id and version fields.
>>>>
>>>> data-config.xml
>>>>
>>>> <dataConfig>
>>>>   <dataSource type="FileDataSource" encoding="UTF-8" />
>>>>   <document>
>>>>     <entity name="post"
>>>>             processor="XPathEntityProcessor"
>>>>             stream="true"
>>>>             forEach="/posts/row/"
>>>>             url="${dataimporter.request.dataurl}"
>>>>             transformer="RegexTransformer,DateFormatTransformer,HTMLStripTransformer">
>>>>       <field column="id" xpath="/posts/row/@Id" />
>>>>       <field column="postTypeId" xpath="/posts/row/@PostTypeId" />
>>>>       <field column="acceptedAnswerId" xpath="/posts/row/@AcceptedAnswerId" />
>>>>       <field column="creationDate" xpath="/posts/row/@CreationDate"
>>>>              dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>>>       <field column="postScore" xpath="/posts/row/@Score" />
>>>>       <field column="viewCount" xpath="/posts/row/@ViewCount" />
>>>>       <field column="body" xpath="/posts/row/@Body" stripHTML="true" />
>>>>       <field column="ownerUserId" xpath="/posts/row/@OwnerUserId" />
>>>>       <field column="lastEditorUserId" xpath="/posts/row/@LastEditorUserId" />
>>>>       <field column="lastEditorDisplayName" xpath="/posts/row/@LastEditorDisplayName" />
>>>>       <field column="lastActivityDate" xpath="/posts/row/@LastActivityDate"
>>>>              dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>>>       <field column="title" xpath="/posts/row/@Title" />
>>>>       <field column="trimmedTags" xpath="/posts/row/@Tags" regex="&lt;(.*)&gt;" />
>>>>       <field column="tags" sourceColName="trimmedTags" splitBy="&gt;&lt;" />
>>>>       <field column="answerCount" xpath="/posts/row/@AnswerCount" />
>>>>       <field column="commentCount" xpath="/posts/row/@CommentCount" />
>>>>       <field column="favoriteCount" xpath="/posts/row/@FavoriteCount" />
>>>>       <field column="communityOwnedDate" xpath="/posts/row/@CommunityOwnedDate"
>>>>              dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>>>     </entity>
>>>>   </document>
>>>> </dataConfig>
>>>>
>>>>
>>>> http://192.168.99.100:8999/solr/solrexchange/select?indent=on&q=*:*&wt=json
>>>> {
>>>>   "responseHeader":{
>>>>     "status":0,
>>>>     "QTime":0,
>>>>     "params":{
>>>>       "q":"*:*",
>>>>       "indent":"on",
>>>>       "wt":"json",
>>>>       "_":"1470811193595"}},
>>>>   "response":{"numFound":8,"start":0,"docs":[
>>>>       {
>>>>         "id":"38822",
>>>>         "_version_":1542258196375142400},
>>>>       {
>>>>         "id":"38836",
>>>>         "_version_":1542258196387725312},
>>>>       {
>>>>         "id":"63896",
>>>>         "_version_":1542258196388773888},
>>>>       {
>>>>         "id":"65406",
>>>>         "_version_":1542258196391919616},
>>>>       {
>>>>         "id":"1357173",
>>>>         "_version_":1542258196391919617},
>>>>       {
>>>>         "id":"5339763",
>>>>         "_version_":1542258196392968192},
>>>>       {
>>>>         "id":"9932722",
>>>>         "_version_":1542258196392968193},
>>>>       {
>>>>         "id":"9217299",
>>>>         "_version_":1542258196392968194}]
>>>>   }}
>>>>
>>>> data_search.xml (8 rows)
>>>>
>>>>
>>>>
>>>> The URL I am hitting (with a custom dataurl parameter):
>>>>
>>>> curl 'http://192.168.99.100:8999/solr/solrexchange/dataimport?command=full-import&commit=true&dataurl=/code/solr/data/search/dih/data_search.xml'
>>>>
>>>> I changed my data to use <add> <doc> <field> and used the bin/post tool, and
>>>> this is working as expected.
>>>> Now I would like to make it work with the DataImportHandler.
>>>> How can I use the DataImportHandler to import my documents?
>>>>
>>>> Thanks,
>>>> Pierre Caserta
>>>>
>>>>
>>
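
Alex's reply in this thread notes that with a managed schema the fields can be added through the API rather than guessed by schemaless mode. The following is a minimal, untested sketch of that route in Python, assuming the Schema API endpoint at <core>/schema accepts add-field commands for this core. The core URL is taken from the examples in the thread, the field names from data-config.xml, and the field types ("text_general", "string") are assumptions that must match types actually defined in the managed-schema. Whether DIH then imports these fields is not confirmed anywhere in the thread.

```python
import json
import urllib.request

# Core URL as used in the thread's examples; adjust for your own setup.
SOLR_CORE_URL = "http://192.168.99.100:8999/solr/solrexchange"

# Field name -> field type. The names come from data-config.xml in the thread;
# the types are assumptions and must exist in your managed-schema.
FIELDS = {
    "title": "text_general",
    "body": "text_general",
    "postTypeId": "string",
    "ownerUserId": "string",
}


def build_add_field_commands(fields):
    """Build a Schema API request body holding one add-field command per field."""
    return {
        "add-field": [
            {"name": name, "type": field_type, "stored": True, "indexed": True}
            for name, field_type in fields.items()
        ]
    }


def post_schema_commands(core_url, commands):
    """POST the commands to <core_url>/schema and return the decoded JSON reply."""
    request = urllib.request.Request(
        core_url + "/schema",
        data=json.dumps(commands).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = build_add_field_commands(FIELDS)
    # Print the request body so it can be reviewed before anything is sent.
    print(json.dumps(payload, indent=2))
    # Uncomment to send the commands to a running Solr instance:
    # print(post_schema_commands(SOLR_CORE_URL, payload))
```

The script only prints the request body by default, so the field list can be checked before the schema is changed. Defining the fields explicitly also avoids depending on the update chain being applied to /dataimport at all.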