Re: ExtractRequestHandler with url instead of path to file

2019-06-12 Thread marotosg
Found the issue: I was using the wrong parameter. It should be stream.url instead of stream.file. http://solrhost:8983/solr/document/update/extract?=true=http://serverwith -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
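
A minimal sketch of how such a request URL could be assembled, assuming remote streaming is enabled on the Solr side. `stream.url` and `stream.file` are the parameter names from the post; the `commit` parameter, the `docserver` host, and the helper `build_extract_url` are assumptions for illustration, since the original request is truncated in the archive.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_extract_url(solr_base, collection, document_url, commit=True):
    """Build an /update/extract request that asks Solr to fetch a remote document.

    stream.url is meant for documents reachable over HTTP, while
    stream.file is meant for paths on the Solr server's own file system.
    """
    params = {
        # Assumed parameter: the post does not confirm which flag was set to true.
        "commit": "true" if commit else "false",
        "stream.url": document_url,
    }
    return f"{solr_base}/solr/{collection}/update/extract?{urlencode(params)}"


# Hypothetical document host, for illustration only.
document_url = "http://docserver:8080/Box_Sync.log"
url = build_extract_url("http://solrhost:8983", "document", document_url)
print(url)
```

`urlencode` percent-encodes the nested URL, so its own `:` and `/` characters cannot be confused with the outer request.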

ExtractRequestHandler with url instead of path to file

2019-06-12 Thread marotosg
Hi, I would like to make a request to Solr to index documents hosted as URLs. This works when I send a path to the file but seems to fail when sending a URL. Sample request: http://solrhost:8983/solr/document/update/extract?=true=http://serverwith docs:8080/Box_Sync.log Do you have any idea

Solr times out connecting to Zookeeper

2019-01-23 Thread marotosg
Hi, I am having some trouble trying to increase the Zookeeper client timeout used when Solr connects to Zookeeper on first startup. I have already tried updating the following properties, with no luck: 1) Update the zkClientTimeout on solr.xml ${zkClientTimeout:9} 2) Uncomment

Best practice to deploy Solr to production

2019-01-22 Thread marotosg
Hi all, I have a Solr index which has been evolving since Solr1.4 and is now on SolrCloud6.6. This cluster is composed of 4 servers, a few collections and shards. Since I first deployed to production in 2009 I have been using the same approach to deploy. I think it's probably time to review and

ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread marotosg
Hi all, Currently I am trying to index documents of different kinds with Solr and Tika. It's working fine, but when Solr returns the content of the document it doesn't return just the plain text; it comes back with some metadata as well. For instance my request.

Indexing documents from S3 bucket

2018-10-08 Thread marotosg
Hi, At the moment I have a SolrCloud cluster with a documents collection being populated by indexing documents coming from a DFS server. Linux boxes are mounting that DFS server using Samba. There is a request to move that DFS server to an AWS S3 bucket. Does anyone have previous experience about

Query with exact number of tokens

2018-09-21 Thread marotosg
Hi, I have to search for company names where my first requirement is to find only exact matches on the company name. For instance if I search for "CENTURY BANCORP, INC." I shouldn't find "NEW CENTURY BANCORP, INC." because the result company has the extra keyword "NEW". I can't use exact match

Java 9 and Solr 6.6

2017-12-01 Thread marotosg
Hi all. Would you recommend installing Solr 6.6.1 with Java 9 for a production environment? Thanks, Sergio

Re: Strip out punctuation at the end of token

2017-11-24 Thread marotosg
is a test. Test. Shawn Heisey-2 wrote > On 11/23/2017 8:06 AM, marotosg wrote: >> I am trying to strip out any "." at the end of a token but I would like >> to >> keep the original token as well. >> This is my index analyzer >> > >

Strip out punctuation at the end of token

2017-11-23 Thread marotosg
Hi all, I am trying to strip out any "." at the end of a token but I would like to keep the original token as well. This is my index analyzer. I was thinking of using the solr.PatternReplaceFilterFactory but I see this one won't keep the original

Restore fails - File missing

2017-07-31 Thread marotosg
Hi, I am trying to do a backup and restore of 1 collection on SolrCloud version 6.1.0. I tried a few times with no issues, but after indexing the collection from scratch and doing a backup, I got an issue on the restore. After the restore, the collection fails complaining about 1 file

Re: HTTP ERROR 504 - Optimize

2017-07-31 Thread marotosg
Basically an issue with loadbalancer timeout. -- View this message in context: http://lucene.472066.n3.nabble.com/HTTP-ERROR-504-Optimize-tp4345815p4348330.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Boost by Integer value on top of query

2017-07-31 Thread marotosg
Thanks a lot for the answer. I finally achieved this using the boost and scale functions on top of my query https://wiki.apache.org/solr/FunctionQuery#scale Thanks to scale, no matter how big or small the values in my People and Assignment columns are, I can map them to a value between 1 and 2.
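
The arithmetic behind mapping values into a range between 1 and 2 can be sketched as plain min-max scaling. This is only an illustration of the idea, not Solr's implementation; the sample counts and the handling of identical values are assumptions.

```python
def scale(values, min_target, max_target):
    """Linearly map values onto [min_target, max_target] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to scale, so this
        # sketch pins everything to min_target (an arbitrary choice).
        return [float(min_target) for _ in values]
    span = max_target - min_target
    return [min_target + (v - lo) * span / (hi - lo) for v in values]


# Hypothetical per-company counts for the two columns mentioned in the post.
people = [3, 50, 1200]
assignments = [10, 10000, 250]

people_boost = scale(people, 1, 2)
assignment_boost = scale(assignments, 1, 2)
print(people_boost)
print(assignment_boost)
```

Because both columns end up in the same range, a very large raw count in one column cannot swamp the other when the two are combined into a boost.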

Boost by Integer value on top of query

2017-07-20 Thread marotosg
Hi, I have a use case where I need to boost documents based on two integer values. Basically I need to retrieve companies using specific criteria like Company name, nationality etc. On top of that query I need to boost the most important ones, which are supposed to be the ones with a higher number of

HTTP ERROR 504 - Optimize

2017-07-13 Thread marotosg
Hi, I am getting HTTP ERROR 504 when doing an optimization operation. Not always, it's pretty random. I am trying to figure out whether Solr, Jetty or the browser breaks the connection after a specific time. I have successful optimizations which take approx. 10 minutes. I had a look at the Solr source code

Re: _version_ / Versioning using timespan

2017-06-01 Thread marotosg
Thanks a lot Susheel. I see this is actually what I need. I have been testing it and noticed that the value of the field always has to be greater for a new document to get indexed. If you send the same version number it doesn't work. Is it possible somehow to overwrite documents with the same
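
The behaviour described here (an update is accepted only when its version is strictly greater than the stored one) can be modelled in a few lines. This is a toy in-memory model for illustration, not Solr code; `apply_update` and the sample documents are hypothetical.

```python
def apply_update(index, doc_id, version, doc):
    """Store doc only if its version is strictly greater than the stored one."""
    current = index.get(doc_id)
    if current is not None and version <= current[0]:
        return False  # stale or equal version: rejected
    index[doc_id] = (version, doc)
    return True


index = {}
results = [
    apply_update(index, "entity-1", 5, {"name": "v5"}),
    apply_update(index, "entity-1", 3, {"name": "v3"}),        # older: rejected
    apply_update(index, "entity-1", 5, {"name": "v5 again"}),  # same: rejected
    apply_update(index, "entity-1", 6, {"name": "v6"}),
]
print(results)
```

Allowing overwrites with the same version would amount to changing `<=` to `<` in this model.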

_version_ / Versioning using timespan

2017-05-31 Thread marotosg
Hi all. I need to implement an indexing solution where my Solr index doesn't get a wrong version. Because I have many versions of the same entity, in some cases my client may end up indexing an earlier version of my entity after a newer one. I was wondering if I can use the _version_

Restore SolrCloud collection using core.properties

2017-05-11 Thread marotosg
Hi, I have a SolrCloud collection I create using a list of properties within a core.properties file. When I create the collection I call the collection API passing the core.properties using the "property.properties=/localpath/core.properties":

Re: Upload core.properties to ZooKeeper

2017-05-11 Thread marotosg
You can load a core.properties using "=/localpath/core.properties" For instance http://solrserver:8983/solr/admin/collections?action=CREATE=person=1=2&=person=/localpath/core.properties

Tagging Locations using SynonymGraphFilterFactory

2017-04-08 Thread marotosg
Hi, I am trying to extract locations from a location field which contains location information in different formats. My initial idea is to extract only UK and USA locations and standardize them. For instance if my field contains "Wakefield" then I will convert it to "Wakefield West Yorkshire".

DataImportHandler OutOfMemory Mysql

2017-04-01 Thread marotosg
Hi, I am trying to load a big table into Solr using DataImportHandler and MySQL. I am getting an OutOfMemory error because Solr is trying to load the full table. I have been reading different posts and tried batchSize="-1". https://wiki.apache.org/solr/DataImportHandlerFaq Do you have any idea

Classify document using bag of words

2017-03-26 Thread marotosg
Hi, I have a very simple use case where I need to classify a document using a bag of words. Basically if a field within the document contains any of the words in my bag then I use a new field to assign a category to the document. Is this something achievable in Solr? I was thinking on

Fetch Data from two collections and filter results

2017-03-15 Thread marotosg
Hi, I am trying to solve a use case and am not fully sure if it can be achieved with the new fancy Solr features. I have two collections, People and Jobs. The Job collection has the id of the person in the schema. 1) Is it possible to run a query against the person collection and join data from the job

After migrating to SolrCloud

2017-01-26 Thread marotosg
Hi All, I have migrated Solr from the older version 3.6 to SolrCloud 6.2 and all is good, but almost every second there are some WARN messages in the logs. HttpParser bad HTTP parsed: 400 HTTP/0.9 not supported for HttpChannelOverHttp@16a84451{r=0,c=false,a=IDLE,uri=null} Does anyone know where these

RTF Rich text format

2016-11-14 Thread marotosg
Hi, I have a use case where I need to index information coming from a database where there is a field which contains rich text format. I would like to convert that text into simple plain text, same as Tika does when indexing documents. Is there any way to achieve that having a field only where I

Re: sorting by date not working on dates earlier than EPOCH

2016-11-14 Thread marotosg
Hi there. I have found a possible solution for this issue.

sorting by date not working on dates earlier than EPOCH

2016-10-28 Thread marotosg
Hi all, I just noticed that sorting on dates is not working as I expect. When I sort ascending by a field of type date, I first get dates earlier than EPOCH, then nulls, and last dates later than EPOCH. Sample response: { numFound: 2052, start: 0, docs: [ { ClosedDateSFD:

Re: Query by distance

2016-10-18 Thread marotosg
This is my field type. I have been reading about this and it looks like the issue is about multi-term synonyms.

Query by distance

2016-10-11 Thread marotosg
Hi, I have a field which contains Job Positions for people. This field uses a SynonymFilterFactory. The field contains the following data "Chief Sales Officer" and my synonyms file has an entry like "Chief Sales Officer, Chief of Sales, Chief Sales Executive". My analyzer returns for "Chief

Indexing (x,y) points representing characteristics

2016-08-23 Thread marotosg
Hi. I have a use case I am trying to solve and am stuck for ideas. I need to index one field in my collection with x,y values which represent where a person is located on an axis based on some of their characteristics. x and y go from 0 to 1 in steps of 0.1. For instance a person can have

Re: Using log4j.xml in Solr6

2016-07-26 Thread marotosg
After a bit of testing I got it working. Basically all the configuration for log4j by default is under server/resources/log4j.properties. By default log4j should be able to find log4j.xml if you delete log4j.properties. I tried it and that's not the case. I figured out this is due to the fact that

Using log4j.xml in Solr6

2016-07-25 Thread marotosg
Hi all, I am trying to upgrade Solr4.11 to Solr6 and am having some trouble with logging. I have Solr4.11 running on Tomcat 6 as a solr.war. Inside my solr.war I updated a few jar files as explained in this post so I can use a "log4j.xml" with some advanced features to compress old files.

Query exact match with ASCIIFoldingFilterFactory

2016-06-08 Thread marotosg
Hi all, I am trying to query and match on a collection of documents with a field which is basically text coming from pdfs. It could contain any type of text. field type It works well in

Re: Highlighting phone numbers

2016-05-19 Thread marotosg
Thanks. Using the debug query returns the info I need.

Highlighting phone numbers

2016-05-18 Thread marotosg
Hi, I have a solr multivalued field with a list of phone numbers with many different formats. Below field type.

Amazon CloudSearch

2016-04-26 Thread marotosg
Hi, I am evaluating the possibility of using Amazon CloudSearch to manage Solr instances. The reason is the price and the time needed to manage and deploy. I am not fully sure yet how flexible that service is in case you need to install a specific Solr version or plugin. Do you have any experience with it?

Re: join and NOT together

2016-02-16 Thread marotosg
Actually I was wrong, this doesn't work. (-DocType:pdf)

join and NOT together

2016-02-15 Thread marotosg
Hi, I am trying to solve an issue when doing a search joining two collections and negating the cross-core query. Let's say I have one collection person and another collection documents, and I can join them using local param !join because I have PersonIDS in the document collection. If my query is

Count multivalued field issue

2016-01-06 Thread marotosg
Hi, I am trying to add a new field to my schema to hold the number of items of a multivalued field. I am using Solr 4.11. These are my fields on *schema.xml* Here is the update done to my *solrconfig.xml*. I created an updateRequestProcessorChain and added it to the update handler

Regression tests and evaluate quality of results

2015-09-30 Thread marotosg
Hi, I have some doubts about how to define a process to evaluate the quality of search results. I have a Solr collection with 4M documents with information about people. I search across several fields like first name, second name, email, address, phone etc. There is plenty of logic in the

Upload core.properties to ZooKeeper

2015-08-06 Thread marotosg
Hi, I am in the process of migrating my master/slave Solr infrastructure to SolrCloud. At the moment I have several cores inside a folder with this structure /MyCores /MyCores/Core1 /MyCores/Core1/conf /MyCores/Core1/core.properties /MyCores/Core2 /MyCores/Core2/conf

Analytics on Solr logs

2015-07-17 Thread marotosg
Hi, I have a use case where we would like to know what the users are searching for: the most commonly used criteria etc. One requirement is related to the user who is searching. We need to know who is making each search, but this is not a criterion itself. It is just analysis information. I was

Re: Solr Exception The remote server returned an error: (400) Bad Request.

2015-05-05 Thread marotosg
Thanks for the answer but I don't think that's going to solve my problem. For instance if I copy this query into the Chrome browser http://localhost:8080/solr48/person/select?q=CoreD:25 I get this error. 4001CoreD:25undefined field CoreD400 If I use wget from Linux wget

Solr Exception The remote server returned an error: (400) Bad Request.

2015-05-05 Thread marotosg
Hi, I am having some difficulties working out which exception I am getting on my client for some queries. Malformed queries always come back to my solrNet client as The remote server returned an error: (400) Bad Request.. Internally Solr is actually printing the log issues like

Validate data Indexed and versioning

2015-03-02 Thread marotosg
Hi, I am trying to define a way of validating whether my index has the same content as my database. I am indexing a very complex denormalized version of the database with many items and nested documents. I have an indexing service which pulls records from a staging table (created based on an ETL

Re: Block join subqueries

2014-12-18 Thread marotosg
To be honest, I don't have a clue what the syntax would be. I tried something like {!type=join from=PersonIdsS to=PersonID fromIndex=assignment}({!type=join from=CompanyID to=CompIDS fromIndex=company v='NationalitySFD:Canada'}) AND type_level:parent but this is two joins from Person to Company

Block join subqueries

2014-12-17 Thread marotosg
Hi, Is it possible to do a query joining three levels? For instance with three cores Person, Person Job and Company. I know it is possible to join from Person to Person Job and from Person to Company. For instance {!type=join from=PersonIdsS to=PersonID fromIndex=personjob}type_level:parent AND

Re: Block join subqueries

2014-12-17 Thread marotosg
Hi Mikhail, Thanks for that. That's exactly what I was looking for but this is for the same core. This allows you to search in a document nested two levels. I was expecting to do the same for cross core joins. That's basically doing a join from Core1 to Core2 to Core3. I couldn't find anything

Re: Block join subqueries

2014-12-17 Thread marotosg
Yes, that's true. I mean join then. Is it possible to join three cores A, B and C? I know it is possible to join A - B and A - C. Is it possible to join them A - B - C? Thanks

Re: ExtractingRequestHandler indexing zip files

2014-09-09 Thread marotosg
Hi keeblerh, the patch has to be applied to the source code and Solr.war compiled again. If you do that, extracting the content of the documents works. Regards, Sergio

Re: ExtractingRequestHandler indexing zip files

2014-05-28 Thread marotosg
I extended ExtractingDocumentLoader with this patch and it works. https://issues.apache.org/jira/secure/attachment/12473188/SOLR-2416_ExtractingDocumentLoader.patch It iterates through all documents and extracts the name and the content of all documents inside the file. Regards, Sergio

Re: ExtractingRequestHandler indexing zip files

2014-05-27 Thread marotosg
Hi, Thanks for your answer Alexandre. I have zip files with only one document inside per zip file. These documents are mainly pdf,xml,html. I tried to index tini.txt.gz file which is located in the trunk to be used by extraction tests

ExtractingRequestHandler indexing zip files

2014-05-26 Thread marotosg
Hi, I am using ExtractingRequestHandler to be able to index different types of documents (doc, pdf, txt, html), but when I try to index compressed files like zip files Solr returns the name of the file inside the field which I am using to map the content. Any idea if this is actually working? I

Return Solr docs in a specific order by list of ids

2014-04-02 Thread marotosg
Hi, I have a use case where I have a list of doc ids and I need to return documents from Solr in the same order as my list of ids. For instance: 459,185,569,8,1,896 Is it possible to return docs from Solr in that same order? Regards, Sergio

Re: Return Solr docs in a specific order by list of ids

2014-04-02 Thread marotosg
I found an easy solution, which is using boosting: (PersonID:459)^0.6 OR (PersonID:185)^0.5 OR (PersonID:569)^0.4 OR (PersonID:8)^0.3 OR (PersonID:1)^0.2 OR (PersonID:896)^0.1
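
Building such a query from an ordered list of ids can be sketched as below. The helper `build_ordered_id_query` is hypothetical, and the approach assumes that each clause matches exactly one document and that base scores are otherwise comparable, so that descending boosts translate into the desired order.

```python
def build_ordered_id_query(field, ids):
    """Return an OR query whose boosts decrease with position in ids."""
    n = len(ids)
    clauses = []
    for position, doc_id in enumerate(ids):
        # First id gets the highest boost, last id the lowest (0.1).
        boost = (n - position) / 10
        clauses.append(f"({field}:{doc_id})^{boost:.1f}")
    return " OR ".join(clauses)


query = build_ordered_id_query("PersonID", [459, 185, 569, 8, 1, 896])
print(query)
```

With more than nine ids the boosts simply grow past 1.0; only their relative order matters for this technique.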

Re: Global User defined properties - solr.xml from Solr 4.4 to Solr 4.5

2013-10-28 Thread marotosg
Done https://issues.apache.org/jira/browse/SOLR-5398

Global User defined properties - solr.xml from Solr 4.4 to Solr 4.5

2013-10-25 Thread marotosg
Hi, I am migrating Solr 4.4 to Solr 4.5 and I have an issue in Solr.xml. In my old Solr.xml I had some properties I am reusing for all my cores. Furthermore I have some properties related to each individual core. solr property name=lucene.version value=LUCENE_40/ property name=store.fields

Re: Global User defined properties - solr.xml from Solr 4.4 to Solr 4.5

2013-10-25 Thread marotosg
Hi Erik. Thanks for your help. I tried with solr.xml as follows: solr property name=lucene.version value=LUCENE_40/ /solr It fails with this exception: Caused by: org.apache.solr.common.SolrException: No system property or default value specified for lucene.version value:${lucene.version}

Re: Global User defined properties - solr.xml from Solr 4.4 to Solr 4.5

2013-10-25 Thread marotosg
Right, but what if you have many properties being shared across multiple cores? That means you have to copy the same properties into each individual core.properties. Isn't this redundant data? My main problem is I would like to keep several properties at Solr level, not at core level. Thanks a lot

Re: Complex query combining fq and q with join

2013-09-24 Thread marotosg
I found the solution. http://dzoessolr020:8080/solr4/person/select/? q= ( ( ( GenderSFD:Male ) AND {!join from=PersonID to=CoreID fromIndex=personjob v='((CoCompanyName:hospital) OR (PoPositionsAllS:developer))'} AND {!join from=DocPersonAttachS to=CoreID fromIndex=document v='(DocNameS:

Complex query combining fq and q with join

2013-09-23 Thread marotosg
Hi all, Thanks in advance for your help. I am trying to create a query joining two cores using {!join} functionality. I have two cores, personcore and personjobcore. *Person core schema*: PersonID, Gender, Age. *Company core schema*: PersonJobID, PersonID, CompanyName, CompanyType, Address. I have to create a complex

Sort by currency field using OpenExchangeRatesOrgProvider

2013-02-28 Thread marotosg
Hi, I have as part of my schema one currency field. fieldType name=currency class=solr.CurrencyField precisionStep=8 providerClass=solr.OpenExchangeRatesOrgProvider refreshInterval=300 defaultCurrency=USD ratesFileLocation=http://myurwithjsonfile/ It works properly when I filter or do a query

ExtractingRequestHandler literals

2013-02-08 Thread marotosg
Hi, I am trying to index some documents using ExtractingRequestHandler and Tika (Solr 3.6). I would like to add some extra data coming from a different source using literal. My schema contains these fields: field name=DocumentID type=string indexed=true stored=true required=true/ field

Restore hot backup

2013-01-09 Thread marotosg
Hi, Is it possible to restore an old backup without shutting down Solr? Regards, Sergio

Query excluding empty values and some criteria

2012-10-18 Thread marotosg
Hi. I am trying to do a query where I need to include empty values and exclude some specific data. For instance my field name is Industry and my query looks like (-Industry:Agriculture) OR (-Industry:[* TO *]) I want to get all empty values OR industries which are not Agriculture. This query does
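
The post is cut off, so the following is only a sketch under an assumption: that the problem is the commonly reported one where a purely negative clause inside parentheses matches nothing on its own. A usual workaround is to anchor each negative clause with `*:*` (match all documents) before subtracting. The helper `empty_or_not` is hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def empty_or_not(field, excluded_value):
    """Match documents where field is empty OR field is not excluded_value."""
    # Anchor each negative clause with *:* so it has a set to subtract from.
    not_excluded = f"(*:* -{field}:{excluded_value})"
    is_empty = f"(*:* -{field}:[* TO *])"
    return f"{not_excluded} OR {is_empty}"


q = empty_or_not("Industry", "Agriculture")
params = urlencode({"q": q})  # percent-encoded form for use in a request URL
print(q)
print(params)
```

Documents with no Industry value are also "not Agriculture", so under the same assumption the first clause alone would cover both requirements; the second clause is kept to mirror the query in the post.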

Suggester component replication

2012-06-22 Thread marotosg
Hi. I have a Solr master and slave servers. After replicating from master to slave the suggester in slave does not have any data. Do you know if replication is possible for a suggester component? Thanks Sergio

Re: Searching partial phone numbers

2012-01-20 Thread marotosg
Hi. I found the solution for that. You can apply a new filter for that field. It's possible to define a text field type with a new filter *filter class=solr.ReversedWildcardFilterFactory withOriginal=true maxPosAsterisk=3 maxPosQuestion=2 maxFractionAsterisk=0.33/* That means you will generate

Searching partial phone numbers

2012-01-19 Thread marotosg
Hi. I have phone numbers in my Solr schema in a field. At the moment I have this field as string. I would like to be able to make searches that find parts of a phone number. For instance: Number +35384589458 search by *+35384* or search by *84589*. Do you know if this is possible? Thanks

PositionIncrementGap inside a field

2012-01-17 Thread marotosg
Hi. At the moment I have a multivalued field where I would like to add information with gaps at the end of every line in the multivalued field and I would like to add gaps as well in the middle of the lines. For instance IBM Corporation some information *here a gap* more

PositionIncrementGap inside a field

2012-01-17 Thread marotosg
Hi. At the moment I have a multivalued field where I would like to add information with gaps at the end of every line in the multivalued field and I would like to add gaps as well in the middle of the lines. For instance field name=CompaniesData type=text indexed=true stored=true

Re: PositionIncrementGap inside a field

2012-01-17 Thread marotosg
Hi Erick. Thanks for your answer. This is almost what I want to do but my problem is that I want to be able to introduce two different sizes of gaps. Something like arr name=CompaniesData str IBM Corporation some information *gap of 30* more information *gap of 100* /str str

Use solr to search in a document repository

2011-12-06 Thread marotosg
Hi. I'm thinking about the option of using Solr to search in a huge document repository. My idea is to read documents (pdf, html, outlook, excel, doc, openoffice, powerpoint...), extract the information from them and index it in Solr. Basically I'm looking for a solution to search in my documents.

Search by range in multivalued fields

2011-08-16 Thread marotosg
Hi. I have a Solr core with job records, and one person can work at different companies, each for a specific range from dateini to dateend. doc arr name=companyinimultivaluefield companyiniIBM10012005companyini companyiniAPPLE10012005companyini /arr arr