solr-injection

2020-02-11 Thread Martin Frank Hansen (MHQ)
Hi, I was wondering how others are handling Solr injection in their solutions? After reading this post: https://www.waratek.com/apache-solr-injection-vulnerability-customer-alert/ I can see how important it is to update to Solr-8.2 or higher. Has anyone been successful in injecting unintende
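
One common client-side mitigation for query injection, sketched here under assumptions: it supposes user input is interpolated into the `q` parameter, and the helper name `escape_solr_term` is hypothetical. It backslash-escapes characters that the standard query syntax treats as operators. It does not address the server-side vulnerabilities described in the linked alert, for which upgrading remains the fix.

```python
# Characters treated as syntax by the Lucene/Solr standard query parser.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_solr_term(text: str) -> str:
    """Backslash-escape query syntax characters in user-supplied text.

    Note: keywords such as AND/OR/NOT are left untouched, so callers may
    still want to wrap the escaped value in double quotes as a phrase.
    """
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


if __name__ == "__main__":
    # A value that would otherwise match every document.
    print(escape_solr_term("*:*"))
```

Escaping on the client narrows what a user can express in a query; it is a complement to, not a substitute for, restricting access to the Solr admin and config endpoints.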

RE: highlighting not working as expected

2019-07-01 Thread Martin Frank Hansen (MHQ)
"string", it will require exact match, including space and upper/lower case. You can use the type "text" for a start, but further down the road it will be good to have your own custom fieldType with your own tokenizer and filter. Regards, Edwin On Tue, 25 Jun 2019 at 14:52,

RE: highlighting not working as expected

2019-06-24 Thread Martin Frank Hansen (MHQ)
6.2019 at 10:06, Martin Frank Hansen (MHQ) wrote: > > Hi, > > I am having some difficulties making highlighting work. For some reason the > highlighting feature only works on some fields but not on other fields even > though these fields are stored. > > An example of a re

RE: highlighting not working as expected

2019-06-16 Thread Martin Frank Hansen (MHQ)
using for the field “Sagstitel”? Is it the same as other fields? Regards, Edwin On Mon, 3 Jun 2019 at 16:06, Martin Frank Hansen (MHQ) wrote: > Hi, > > I am having some difficulties making highlighting work. For some > reason the highlighting feature only works on some fields but no

RE: highlighting not working as expected

2019-06-16 Thread Martin Frank Hansen (MHQ)
documents? > On 03.06.2019 at 10:06, Martin Frank Hansen (MHQ) wrote: > > Hi, > > I am having some difficulties making highlighting work. For some reason the > highlighting feature only works on some fields but not on other fields even > though these fields are stored

RE: highlighting not working as expected

2019-06-10 Thread Martin Frank Hansen (MHQ)
Please try hl.method=unified and tell us if that helps. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Jun 3, 2019 at 4:06 AM Martin Frank Hansen (MHQ) wrote: > Hi, > > I am having some difficulties making highlighting work. For some

highlighting not working as expected

2019-06-03 Thread Martin Frank Hansen (MHQ)
Hi, I am having some difficulties making highlighting work. For some reason the highlighting feature only works on some fields but not on other fields even though these fields are stored. An example of a request looks like this: http://localhost/solr/mytest/select?fl=id,doc.Type,Journalnummer,
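
The suggestion in this thread to try `hl.method=unified` can be sketched as a request builder. This is a minimal illustration, not the poster's actual request: the base URL and the field name `Sagstitel` come from the thread, while the function name and the query text are assumptions.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, fields, method="unified"):
    """Build a /select URL with highlighting enabled for the given fields."""
    params = {
        "q": query,
        "hl": "true",                # turn highlighting on
        "hl.fl": ",".join(fields),   # fields to highlight
        "hl.method": method,         # highlighter implementation to use
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_highlight_url("http://localhost/solr/mytest", "test", ["Sagstitel"]))
```

Building the URL this way makes it easy to compare the response for one field at a time when only some fields return highlights.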

RE: highlighter, stored documents and performance

2019-03-21 Thread Martin Frank Hansen (MHQ)
without highlighting. > On 21.03.2019 at 17:05, Martin Frank Hansen (MHQ) wrote: > > Hi, > > I am wondering how highlighting in Solr performs when the number > of documents gets large? > > Right now we have about 1 TB of data in all sorts of file types an

highlighter, stored documents and performance

2019-03-21 Thread Martin Frank Hansen (MHQ)
Hi, I am wondering how highlighting in Solr performs when the number of documents gets large? Right now we have about 1 TB of data in all sorts of file types and I was wondering how storing these documents within Solr (for highlighting purposes) will affect performance? Is it possib

RE: Update handler and atomic update

2019-03-19 Thread Martin Frank Hansen (MHQ)
"clicks":{"inc":"1"}}] in the raw body, hence using curl or any other app that allows this, like Postman. Best regards Thierry > On 19 Mar 2019, at 08:59, Martin Frank Hansen (MHQ) wrote: > > Hi Thierry, > > Do you mean something like this? > > http://loc

RE: Update handler and atomic update

2019-03-19 Thread Martin Frank Hansen (MHQ)
"docid","clicks":{"inc":"1"}}] In an /update?commit=true Best regards Thierry See documentation here https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html > On 19 Mar 2019, at 08:14, Martin Frank Hansen (MHQ) wrote: > > Hi, > > Ho

Update handler and atomic update

2019-03-19 Thread Martin Frank Hansen (MHQ)
Hi, Hope someone can help me, I am trying to make an incremental update for one document using the API, but cannot make it work. I have tried a lot of things and all I actually want is to increment the value of the field “clicks” by one. I have something like this: http://localhost:8983/solr/..
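
The atomic update discussed in this thread can be sketched as follows. The payload shape (`"clicks":{"inc":...}` sent to `/update?commit=true`) is taken from the replies above; the function name, the unique key field name `id`, and the use of an integer increment are assumptions for illustration.

```python
import json


def build_increment_update(doc_id, field, amount=1, id_field="id"):
    """Return the JSON body for an atomic 'inc' update of a single document.

    The body is meant to be POSTed to <core>/update?commit=true with
    Content-Type: application/json. id_field is assumed to be the uniqueKey.
    """
    return json.dumps([{id_field: doc_id, field: {"inc": amount}}])


if __name__ == "__main__":
    print(build_increment_update("docid", "clicks"))
```

The modifier goes in the request body rather than in the URL, which is why the replies recommend curl or a similar client that can send a raw JSON body.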

RE: MLT and facetting

2019-03-01 Thread Martin Frank Hansen (MHQ)
put when combined together, as it is not working > currently and there is no way to test. > > Regards, > Edwin > > On Thu, 28 Feb 2019 at 14:51, Martin Frank Hansen (MHQ) wrote: > >> Hi Edwin, >> >> Ok that is nice to know. Do you know when this bug will g

RE: MLT and facetting

2019-03-01 Thread Martin Frank Hansen (MHQ)
cording to the number of > occurrences. But I'm not sure how it will affect the MLT score or how > it will be output when combined together, as it is not working > currently and there is no way to test. > > Regards, > Edwin > >> On Thu, 28 Feb 2019 at 14:51, Martin Frank

RE: MLT and facetting

2019-03-01 Thread Martin Frank Hansen (MHQ)
test. Regards, Edwin On Thu, 28 Feb 2019 at 14:51, Martin Frank Hansen (MHQ) wrote: > Hi Edwin, > > Ok that is nice to know. Do you know when this bug will get fixed? > > By ordering I mean that MLT scores the documents according to its > similarity function (believe it is co

RE: MLT and facetting

2019-02-27 Thread Martin Frank Hansen (MHQ)
before, so I'm not sure how it works. For the ordering of the documents, do you mean to sort them according to the criteria that you want? Regards, Edwin On Wed, 27 Feb 2019 at 14:43, Martin Frank Hansen (MHQ) wrote: > Hi Edwin, > > Thanks for your response. Are you sure it is a

RE: MLT and facetting

2019-02-26 Thread Martin Frank Hansen (MHQ)
same problem in Solr 7.7 if I turn on faceting in /mlt requestHandler. Found this issue in the JIRA: https://issues.apache.org/jira/browse/SOLR-7883 Seems like it is a bug in Solr and it has not been resolved yet. Regards, Edwin On Tue, 26 Feb 2019 at 21:03, Martin Frank Hansen (MHQ) wrote: >

RE: MLT and facetting

2019-02-26 Thread Martin Frank Hansen (MHQ)
solrconfig.xml? Regards, Edwin On Tue, 26 Feb 2019 at 14:43, Martin Frank Hansen (MHQ) wrote: > Hi Edwin, > > Thanks for your response. > > Yes you are right. It was simply the search parameters from Solr. > > The query looks like this: > > http:// > .../solr/.../mlt?df=

RE: MLT and facetting

2019-02-25 Thread Martin Frank Hansen (MHQ)
Sorry, forgot to mention that we are using Solr 7.5. -Original Message- From: Martin Frank Hansen (MHQ) Sent: 26 February 2019 07:43 To: solr-user@lucene.apache.org Subject: RE: MLT and facetting Hi Edwin, Thanks for your response. Yes you are right. It was

RE: MLT and facetting

2019-02-25 Thread Martin Frank Hansen (MHQ)
ink there are some pictures which are not being sent through in > the email. > > Do send your query that you are using, and which version of Solr you > are using? > > Regards, > Edwin > >> On Mon, 25 Feb 2019 at 20:54, Martin Frank Hansen (MHQ) wrote: >>

RE: MLT and facetting

2019-02-25 Thread Martin Frank Hansen (MHQ)
of Solr you are using? Regards, Edwin On Mon, 25 Feb 2019 at 20:54, Martin Frank Hansen (MHQ) wrote: > Hi, > > > > I am trying to combine the mlt functionality with facets, but Solr > throws > org.apache.solr.common.SolrException: "Unable to compute facet > ra

MLT and facetting

2019-02-25 Thread Martin Frank Hansen (MHQ)
Hi, I am trying to combine the mlt functionality with facets, but Solr throws org.apache.solr.common.SolrException: "Unable to compute facet ranges, facet context is not set". What I am trying to do is quite simple: find similar documents using mlt and group these using the facet parameter.

RE: unable to create new threads: out-of-memory issues

2019-02-12 Thread Martin Frank Hansen (MHQ)
ow. 3. SolrClient is definitely a subject for heavy reuse. On Tue, Feb 12, 2019 at 5:16 PM Martin Frank Hansen (MHQ) wrote: > Hi Mikhail, > > I am using SolrJ but think I might have found the problem. > > I am doing an atomicUpdate on existing documents, and found out that I > crea

RE: unable to create new threads: out-of-memory issues

2019-02-12 Thread Martin Frank Hansen (MHQ)
did you get this error? Usually it occurs in custom code with many new Thread() calls and is usually cured with thread pooling. On Tue, Feb 12, 2019 at 3:25 PM Martin Frank Hansen (MHQ) wrote: > Hi, > > I am trying to create an index on a small Linux server running > Solr-7.5.0, but

unable to create new threads: out-of-memory issues

2019-02-12 Thread Martin Frank Hansen (MHQ)
Hi, I am trying to create an index on a small Linux server running Solr-7.5.0, but keep running into problems. When I try to index a file-folder of roughly 18 GB (18000 files) I get the following error from the server: java.lang.OutOfMemoryError: unable to create new native thread. From the

RE: indexing multiple levels of data

2018-11-16 Thread Martin Frank Hansen (MHQ)
d the burden of building and running a separate app will probably be worth it. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 16. nov. 2018 kl. 12:24 skrev Martin Frank Hansen (MHQ) : > > Hi, > > I am trying to add meta data and files to Solr, but a

indexing multiple levels of data

2018-11-16 Thread Martin Frank Hansen (MHQ)
Hi, I am trying to add meta data and files to Solr, but am experiencing some problems. Data is divided in two: cases and files. For each case the meta-data is given in an xml document, while meta data for the files is given in another xml document, and the actual files are kept in yet a

RE: Merging data from different sources

2018-10-31 Thread Martin Frank Hansen (MHQ)
rging data from different sources > > Maybe > https://lucene.apache.org/solr/guide/7_5/update-request-processors.html#atomicupdateprocessorfactory > > Regards, > Alex > > On Tue, Oct 30, 2018, 7:57 AM Martin Frank Hansen (MHQ), wrote: > > > Hi, > > >

RE: Merging data from different sources

2018-10-30 Thread Martin Frank Hansen (MHQ)
. October 2018 13:16 To: solr-user Subject: Re: Merging data from different sources Maybe https://lucene.apache.org/solr/guide/7_5/update-request-processors.html#atomicupdateprocessorfactory Regards, Alex On Tue, Oct 30, 2018, 7:57 AM Martin Frank Hansen (MHQ), wrote: > Hi, > > I am

Merging data from different sources

2018-10-30 Thread Martin Frank Hansen (MHQ)
Hi, I am trying to merge files from different sources and with different content (except for one key-field). How can this be done in Solr? An example could be: Document 1 001 Unique id for Document 1 test-123 … Do

RE: Tesseract language

2018-10-28 Thread Martin Frank Hansen (MHQ)
> > On Sat, Oct 27, 2018 at 12:39 AM Martin Frank Hansen (MHQ) > > wrote: > > > Hi Rohan, > > > > Thanks for your reply, are you using tess4j with Tika or on its own? > > I will take a look at tess4j if I can't make it work with Tika alone. > > >

RE: Tesseract language

2018-10-27 Thread Martin Frank Hansen (MHQ)
n Fri, Oct 26, 2018 at 12:31 PM Martin Frank Hansen (MHQ) wrote: > Hi Tim, > > You were right. > > When I called `tesseract testing/eurotext.png testing/eurotext-dan -l > dan`, I got an error message so I downloaded "dan.traineddata" and > added it to the Tesserac

RE: Tesseract language

2018-10-26 Thread Martin Frank Hansen (MHQ)
`, Tika _should_ be able to specify "dan" with your code above. On Fri, Oct 26, 2018 at 10:49 AM Martin Frank Hansen (MHQ) wrote: > > Hi again, > > Now I moved the OCR part to Tika, but I still can't make it work with Danish. > It works when using default language settin

RE: Tesseract language

2018-10-26 Thread Martin Frank Hansen (MHQ)
tem.out.println(handler.toString()); } Hope that someone can help here. -Original Message- From: Martin Frank Hansen (MHQ) Sent: 22 October 2018 07:58 To: solr-user@lucene.apache.org Subject: SV: Tesseract language Hi Erick, Thanks for the help! I will take a look at it. Martin Frank Hans

RE: Reading data using Tika to Solr

2018-10-26 Thread Martin Frank Hansen (MHQ)
attachment exceptions. On Fri, Oct 26, 2018 at 6:25 AM Martin Frank Hansen (MHQ) wrote: > Hi again, > > Never mind, I managed to get the content of the msg-files as well > using the following link as inspiration: > https://wiki.apache.org/tika/RecursiveMetadata > > But thanks ag

RE: Reading data using Tika to Solr

2018-10-26 Thread Martin Frank Hansen (MHQ)
Hi again, Never mind, I managed to get the content of the msg-files as well using the following link as inspiration: https://wiki.apache.org/tika/RecursiveMetadata But thanks again for all your help! -Original Message- From: Martin Frank Hansen (MHQ) Sent: 26 October 2018 10:14 To

RE: Reading data using Tika to Solr

2018-10-26 Thread Martin Frank Hansen (MHQ)
Solr If you’re processing actual msg (not eml), you’ll also need poi and poi-scratchpad and their dependencies, but then those msgs could have attachments, at which point, you may as well just add tika-app. :D On Thu, Oct 25, 2018 at 2:46 PM Martin Frank Hansen (MHQ) wrote: > Hi Erick and

RE: Reading data using Tika to Solr

2018-10-25 Thread Martin Frank Hansen (MHQ)
'm usually lazy and just execute it in > IntelliJ for development and have forgotten to set my classpath on > _numerous_ occasions when running it from a command line ;) > > Best, > Erick > > On Thu, Oct 25, 2018 at 2:55 AM Martin Frank Hansen (MHQ) > wrote: > > > > H

Reading data using Tika to Solr

2018-10-25 Thread Martin Frank Hansen (MHQ)
Hi, I am trying to read the content of msg-files using Tika and index these in Solr; however, I am having some problems with the OfficeParser(). I keep getting the error java.lang.NoClassDefFoundError for the OfficeParser, even though both tika-core and tika-parsers are included in the build path.

SV: Tesseract language

2018-10-21 Thread Martin Frank Hansen (MHQ)
Hi Erick, Thanks for the help! I will take a look at it. Martin Frank Hansen, Senior Data Analyst Data, IM & Analytics Lautrupparken 40-42, DK-2750 Ballerup E-mail m...@kmd.dk Web www.kmd.dk Mobile +4525571418 -Original Message- From: Erick Erickson Sent: 21 October

SV: Tesseract language

2018-10-21 Thread Martin Frank Hansen (MHQ)
Hi Gus, Thank you so much! I will definitely take a look at it during the day. Martin Frank Hansen -Original Message- From: Gus Heck Sent: 22 October 2018 00:06 To: solr-user@lucene.apache.org Subject: Re: Tesseract language Hi Martin, I wrote a framework (https://github.com

SV: Tesseract language

2018-10-21 Thread Martin Frank Hansen (MHQ)
Hi Alex, Thanks again for your reply, much appreciated. Martin Frank Hansen -Original Message- From: Alexandre Rafalovitch Sent: 21. okt

SV: Tesseract language

2018-10-21 Thread Martin Frank Hansen (MHQ)
data to a Solr instance? Best regards Martin Frank Hansen -Original Message- From: Alexandre Rafalovitch Sent: 21 October 2018 16:26 To: solr-user Subject: Re: Tesseract language There are a couple of things mixed in here: 1) Extract handler is not recommended for production usage

SV: Tesseract language

2018-10-21 Thread Martin Frank Hansen (MHQ)
Hi again, Is there anyone who has some experience of using Tesseract’s OCR module within Solr? The files I am trying to read into Solr are Danish TIFF documents. Martin Frank Hansen

Tesseract language

2018-10-18 Thread Martin Frank Hansen (MHQ)
-handler to import single files into Solr, and it worked for a single file, how would I implement several files from a file-system? Here is the request-handler I used: false ignored_ true Martin Frank Hansen

SV: DIH for TikaEntityProcessor

2018-10-12 Thread Martin Frank Hansen (MHQ)
You sir just made my day!!! It worked!!! Thanks a million! Martin Frank Hansen -Original Message- From: Kamuela Lau Sent: 12 October 2018 11:41 To: solr-user@lucene.apache.org Subject: Re: DIH for TikaEntityProcessor Also, just wondering, have you tried to specify

SV: DIH for TikaEntityProcessor

2018-10-12 Thread Martin Frank Hansen (MHQ)
Hi Kamuela, Thanks for your answer. I still get the same error, so I think I will try with the tech-products example to see if it works there, as Alexandre suggests in the mail above. Martin Frank Hansen -Original Message- From: Kamuela Lau Sent: 12 October 2018 11:38 To: solr

SV: DIH for TikaEntityProcessor

2018-10-12 Thread Martin Frank Hansen (MHQ)
Hi again, Can anybody help me? Any suggestions as to why I am getting the error below? Martin Frank Hansen

DIH for TikaEntityProcessor

2018-10-10 Thread Martin Frank Hansen (MHQ)
appreciated. Martin Frank Hansen

SV: DIH for different levels of XML

2018-10-07 Thread Martin Frank Hansen (MHQ)
Hi Alex, Thanks for your answer. I think I made it work. The problem was actually in the schema.xml, where the field "Journalnummer" should have multiValued="true". Martin Frank Hansen

DIH for different levels of XML

2018-10-07 Thread Martin Frank Hansen (MHQ)
Hi, I am having some difficulties adding data from different levels of an XML document. The XML can be as simple as this: 2165432 5 10 The data-config-file looks like this. The result is the following: { "respon

SV: data-import-handler for solr-7.5.0

2018-10-02 Thread Martin Frank Hansen (MHQ)
28548113 89 Now I guess I just have to add to this solution. Thanks for your help Alex, and also thanks to Jan for answering the first mail. Best regards Martin Frank Hansen -Original Message- From: Alexandre Rafalovitch Sent: 2 October 2018 19:52 To: solr-user Subject

SV: data-import-handler for solr-7.5.0

2018-10-02 Thread Martin Frank Hansen (MHQ)
Thanks for the info, the UI looks interesting... It does read the data-config correctly, so the problem is probably in this file. Martin Frank Hansen

SV: data-import-handler for solr-7.5.0

2018-10-02 Thread Martin Frank Hansen (MHQ)
"Total Documents Skipped":"0", "Full Dump Started":"2018-10-02 16:15:21", "":"Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.", "Committed":"2018-10-02 16:15:22", "Time taken"

data-import-handler for solr-7.5.0

2018-10-02 Thread Martin Frank Hansen (MHQ)
adding the request-handler by adding the following lines: C:/Users/z6mhq/Desktop/nh/nh/conf/data-config.xml I am running a core residing in the folder “C:/Users/z6mhq/Desktop/nh/nh/conf” while the Solr installation is in “C:/Users/z6mhq/Documents/solr-7.5.0”. I really h

Re: Decompound German Words

2012-05-06 Thread Martin Frank
Dear Satish, did you find a decompounding dictionary for German? Best Regards Martin -- View this message in context: http://lucene.472066.n3.nabble.com/Decompound-German-Words-tp3708194p3966013.html Sent from the Solr - User mailing list archive at Nabble.com.