have the stopwordfilter?
On Mon, 07 Apr 2014 00:37:15 +0200, Furkan KAMACI furkankam...@gmail.com
wrote:
Correction: My patch is at SOLR-5152
On 7 Apr 2014 01:05, Andreas Owen ao...@swissonline.ch wrote:
I thought I could use <filter class="solr.LengthFilterFactory" min="1"
max="2"/> to index
I have a fieldtype that uses the ngramfilter while indexing. Is there a
setting that can force the ngramfilter to index smaller words than the
minGramSize? Mine is set to 3 and the search won't find words that are only
1 or 2 chars long. I would like to not set minGramSize=1 because the
I thought I could use <filter class="solr.LengthFilterFactory" min="1"
max="2"/> to index and search words that are only 1 or 2 chars long. It
seems to work but I have to test it some more.
On Sun, 06 Apr 2014 22:24:20 +0200, Andreas Owen ao...@swissonline.ch
wrote:
I have a fieldtype
I would like to call a URL after the import is finished with the event
attribute <document onImportEnd="...">. How can I do this?
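For reference, onImportEnd does not take a URL directly; it expects the name of a class implementing DIH's EventListener interface, which can then call the URL. A sketch (the class name is a hypothetical placeholder):

```xml
<!-- data-config.xml: onImportEnd expects a class implementing
     org.apache.solr.handler.dataimport.EventListener; that class's
     onEvent() would do the HTTP call to the target URL -->
<document onImportEnd="com.example.NotifyUrlListener">
  ...
</document>
```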
When I select a facet in thema_f all the others in the group disappear
but the other facets keep the original findings. It seems like it should
work. Maybe the underscore is the wrong char for the separator?
example documents in index:
<doc>
<arr name="thema_f">
<str>1_Produkte</str>
, 2014 at 1:56 PM, Andreas Owen wrote:
I would like to call a URL after the import is finished with the event
attribute <document onImportEnd="...">. How can I do this?
--
Using Opera's mail client: http://www.opera.com/mail/
Is there a way to tell the ngramfilterfactory while indexing that numbers shall
never be tokenized? Then the query should be able to find numbers.
Or do I have to change the ngram-min for numbers (not alpha) to 1, if that
is possible? So to speak, put the whole number in as one token and not all possible
tell the query to search numbers differently with WT, LCF or whatever?
I attached a doc with screenshots from the solr analyzer.
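One common way to keep numbers whole (a sketch, not from the thread; the field type name is illustrative) is a side field whose tokenizer only emits the digit runs, left un-ngrammed, queried alongside the main field:

```xml
<!-- hypothetical side field: numbers stay whole, no ngram filter -->
<fieldType name="text_numbers" class="solr.TextField">
  <analyzer>
    <!-- split on runs of non-digits, so each number becomes one whole token -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^0-9]+"/>
  </analyzer>
</fieldType>
```

The ngram field keeps minGramSize=3 for words, and a qf listing both fields lets queries match full numbers.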
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Thursday, 13 March 2014 13:44
To: solr-user@lucene.apache.org
Subject: RE: Re[2]: NOT SOLVED
If I use the underscore in the query I don't get any results. If I remove
the underscore it finds the docs with underscore.
Can I tell solr to search through the ngtf instead of the wdf or is there
any better solution?
Query: yh_cug
I attached a doc with the analyzer output
I have given up this idea and made a wrapper which adds an fq with the user roles
to each request.
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Tuesday, 11 March 2014 23:32
To: solr-user@lucene.apache.org
Subject: use local param in solrconfig fq for access-control
enablePositionIncrements="true"/> <!-- remove common words -->
<filter class="solr.GermanNormalizationFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory"
language="German"/>
</analyzer>
</fieldType>
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch
gets generated as a separate token.
-- Jack Krupansky
-Original Message-
From: Andreas Owen
Sent: Tuesday, March 11, 2014 5:09 AM
To: solr-user@lucene.apache.org
Subject: RE: NOT SOLVED searches for single char tokens instead of from 3
upwards
I got it right the first time and here
.
Generally, using pure defaults for WDF is not what you want, especially for
query time. Usually there needs to be a slight asymmetry between index and
query for WDF - index generates more terms than query.
-- Jack Krupansky
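The asymmetry Jack describes is commonly configured along these lines (a sketch with typical values, not the poster's actual schema):

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- index side: generate word/number parts AND catenated forms -->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1"
          catenateWords="1" catenateNumbers="1" catenateAll="0"
          splitOnCaseChange="1"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- query side: split only, no catenation, so fewer terms are generated -->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1"
          catenateWords="0" catenateNumbers="0" catenateAll="0"
          splitOnCaseChange="1"/>
</analyzer>
```

The index side produces the superset of terms; the query side produces fewer, so every query term can find a match that was indexed.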
-Original Message-
From: Andreas Owen
Sent
:[* TO *] +roles:($r))
(+organisations:($org) -roles:[* TO *])</str>
</lst>
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Wednesday, 12 March 2014 14:44
To: solr-user@lucene.apache.org
Subject: Re[2]: NOT SOLVED searches for single char tokens instead of from 3
upwards
yes
I have a field with the following type:
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
Sorry, I looked at the wrong fieldtype.
-Original Message-
From: Andreas Owen a...@conx.ch
To: solr-user@lucene.apache.org
Date: 11/03/2014 08:45
Subject: searches for single char tokens instead of from 3 upwards
i have a field with the following type:
fieldType name
I got it right the first time and here is my requesthandler. The field
plain_text is searched correctly and has the same fieldtype as title -
text_de
<queryParser name="synonym_edismax"
class="solr.SynonymExpandingExtendedDismaxQParserPlugin">
<lst name="synonymAnalyzers">
<lst>
This works great but I would like to use local params r and org instead of
hard-coded values:
<str name="fq">(*:* -organisations:[* TO *] -roles:[* TO *])
(+organisations:(150 42) +roles:(174 72))
I would like
<str name="fq">(*:* -organisations:[* TO *] -roles:[* TO *])
Shouldn't the numbers be in the output below (parsed_filter_queries) and not
$r and $org?
This works great but I would like to use local params r and org instead
of hard-coded values:
<str name="fq">(*:* -organisations:[* TO *] -roles:[* TO
*]) (+organisations:(150 42) +roles:(174 72))
I would like to use $r and $org for access control. It has to allow the fq's
from my facet to work as well. I'm not sure if I'm doing it right or if I
should add it to a qf or the q itself. The debugQuery returns a parsed fq
string and in it $r and $org are printed instead of their values.
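A plain $r inside an fq string is not substituted; parameter dereferencing only happens inside local-params syntax. A sketch of the usual workaround (the parameter name rolefilter is illustrative):

```xml
<lst name="appends">
  <!-- {!...} local-params syntax dereferences $rolefilter at request time -->
  <str name="fq">{!lucene v=$rolefilter}</str>
  <!-- default value; callers override it with &rolefilter=... on the request -->
  <str name="rolefilter">(*:* -organisations:[* TO *] -roles:[* TO *])</str>
</lst>
```

Facet fq's sent by the client are unaffected, since they arrive as separate fq parameters.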
Does this maxClauseCount go over each field individually or all of them put together?
Is it the date fields?
when I execute a query I get this error:
<lst name="responseHeader"><int name="status">500</int><int
name="QTime">93</int><lst name="params"><str name="indent">true</str>
<str name="q">Ein PDFchen als
I want to use the following in fq and I need to set the operator to OR. My q.op
is AND but I need OR in fq. I have read about ofq but that is for putting OR
between multiple fq. Can I set the operator for fq?
(-organisations:[* TO *] -roles:[* TO *]) (+organisations:(150 42)
+roles:(174
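One way to do this (a sketch, not from the thread): override q.op for that single filter query with a local param, leaving the global q.op=AND untouched:

```
fq={!lucene q.op=OR}(-organisations:[* TO *] -roles:[* TO *]) (+organisations:(150 42) +roles:(174 72))
```

The {!lucene q.op=OR} prefix applies only to this fq; q and any other fq parameters keep their own operator.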
at 11:54 AM, Andreas Owen a...@conx.ch wrote:
I tried it in solr admin query and it showed me all the docs without a
value
in organisations and roles. It didn't matter if I used a base term; isn't
that given through the q parameter?
-Original Message-
From: Raymond Wiker
: Andreas Owen [mailto:a...@conx.ch]
Sent: Monday, 17 February 2014 05:08
To: solr-user@lucene.apache.org
Subject: query parameters
In the solrconfig of my solr 4.3 I have a user-defined requestHandler. I would like
to use fq to force the following conditions:
1: organisations is empty and roles
in your config file; i.e., something like
(*:* -organisations:[* TO *] -roles:[* TO *])
On Tue, Feb 18, 2014 at 12:16 PM, Andreas Owen a...@conx.ch wrote:
It seems that fq doesn't accept OR because: (organisations:(150 OR 41)
AND
roles:(174)) OR (-organisations:[* TO *] AND -roles
In the solrconfig of my solr 4.3 I have a user-defined requestHandler. I would like
to use fq to force the following conditions:
1: organisations is empty and roles is empty
2: organisations contains one of the comma-delimited list in variable $org
3: roles contains one of the comma-delimited
I'm using solr 4.3.1 and have installed it on a Win 2008 server. Solr is
working, for example import and search. But the admin GUI's right side isn't
loading and I get a javascript error for several d3 objects. The last error
is:
Load timeout for modules: lib/order!lib/jquery.autogrow
and you'll see exactly how docs are scored. Also,
it'll show you exactly how your query is parsed. Paste that if it's
confused, it'll help figure out what's going wrong.
On Tue, Dec 3, 2013 at 1:37 PM, Andreas Owen a...@conx.ch wrote:
So isn't it sorted automatically by relevance (boost value
the question now is where -Infinity comes
from, this looks suspicious:
-Infinity = (MATCH) FunctionQuery(log(int(clicks))), product of:
-Infinity = log(int(clicks)=0)
not much help I know, but
Erick
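The -Infinity in that explain output comes from log(0): clicks is 0 for that document, and log(int(clicks)=0) evaluates to -Infinity, which sinks the score. A common guard (a sketch, assuming a multiplicative boost function is being used on the request) is to shift the argument so it can never be zero:

```
boost=log(sum(clicks,1))
```

With sum(clicks,1), a document with zero clicks contributes log(1)=0 to the boost instead of -Infinity, and documents with more clicks still score higher.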
On Wed, Dec 4, 2013 at 7:24 AM, Andreas Owen a...@conx.ch wrote:
Hi Erick
Here
When I search for agenda I get a lot of hits. Now if I update the 2nd
result by json-update, the doc is moved to the end of the index when I search
for it again. The field I change is editorschoice and it never contains
the search term agenda so I don't see why it changes the order. Why does
field? And then make sure not to
change the timestamp when you do an update that you don't want to change the
order?
Apologies if I've misunderstood the situation.
On 12/3/13 1:00 PM, Andreas Owen wrote:
When I search for agenda I get a lot of hits. Now if I update the 2nd
result by json-update
I am querying test in solr 4.3.1 over the field below and it's not finding
all occurrences. It seems that if it is a substring of a word like
Supertestplan it isn't found unless I use wildcards: *test*. This is
right because of my tokenizer, but does someone know a way around this? I
don't want to
I suppose I have to create another field with different tokenizers and set
the boost very low so it doesn't really mess with my ranking, because
the word is now in 2 fields. What kind of tokenizer can do the job?
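A sketch of that two-field approach (field names and boosts are illustrative, not from the thread): an ngram-analyzed copy catches substring matches, and a low qf boost keeps it from dominating ranking.

```xml
<!-- hypothetical substring field: "test" matches inside "Supertestplan" -->
<fieldType name="text_ngram" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- index all 3-15 char substrings of each token -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
</fieldType>

<copyField source="plain_text" dest="plain_text_ngram"/>
```

Queried with something like qf=plain_text^10 plain_text_ngram^0.5, exact-token matches dominate and substring matches only act as a fallback.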
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Thursday, 21 November 2013
I solved it by adding a loop for years and one for quarters in which I count
the month facets.
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Monday, 11 November 2013 17:52
To: solr-user@lucene.apache.org
Subject: RE: date range tree
Has someone at least got an idea how I could do a year/month date tree?
In the Solr wiki it is mentioned that facet.date.gap=+1DAY,+2DAY,+3DAY,+10DAY
should create 4 buckets, but it doesn't work.
-Original Message-
From: Andreas Owen [mailto:a...@conx.ch]
Sent: Thursday, 7 November 2013
I have a multivalue field with links pointing to ids of solr documents. I
would like to calculate how many links point to each document and put
that number into the field links2me. How can I do this? I would prefer to do
it with a query and the updater so solr can do it internally if possible.
I would like to make a facet on a date field with the following tree:
2013
  4th quarter
    December
    November
    October
  3rd quarter
    September
    August
    July
  2nd quarter
    June
    May
    April
  1st quarter
    March
    February
    January
2012
  same as above
So far I have this in solrconfig.xml:
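One way to build such a tree without a date-math gap list (a sketch; the field name date and the key labels are assumptions) is one facet.query per tree node, with the client nesting the buckets by label:

```xml
<lst name="defaults">
  <str name="facet">true</str>
  <!-- one facet.query per node; {!key=...} labels the bucket in the response -->
  <str name="facet.query">{!key=2013_Q4}date:[2013-10-01T00:00:00Z TO 2014-01-01T00:00:00Z}</str>
  <str name="facet.query">{!key=2013_12}date:[2013-12-01T00:00:00Z TO 2014-01-01T00:00:00Z}</str>
  <str name="facet.query">{!key=2013_11}date:[2013-11-01T00:00:00Z TO 2013-12-01T00:00:00Z}</str>
  <str name="facet.query">{!key=2013_10}date:[2013-10-01T00:00:00Z TO 2013-11-01T00:00:00Z}</str>
</lst>
```

The mixed [ ... } brackets make the upper bound exclusive so months and quarters don't double-count boundary documents; the UI then groups keys like 2013_Q4 and 2013_12 into the year/quarter/month tree.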
, Sep 29, 2013 at 9:47 AM, Andreas Owen a...@conx.ch wrote:
how dumb can you get. obviously quite dumb... I would have to analyze the
html pages with a nested instance like this:
<entity name="rec" processor="XPathEntityProcessor"
url="file:///C:\ColdFusion10\cfusion\solr\solr\tkbintranet
:
org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.ClassCastException:
sun.net.www.protocol.http.HttpURLConnection$HttpInputStream cannot be cast to
java.io.Reader
On 28. Sep 2013, at 1:39 AM, Andreas Owen wrote:
ok, I see what you're getting at, but why doesn't the following work:
<field xpath="//h:h1"
PM, Andreas Owen a...@conx.ch wrote:
<entity name="tika" processor="TikaEntityProcessor"
url="${rec.urlParse}" dataSource="dataUrl" onError="skip" format="html">
<field column="text"/>
<entity name="detail" type="XPathEntityProcessor"
forEach="/html"
FieldReaderDataSource?
Cheers,
Tricia
On Thu, Sep 26, 2013 at 4:17 AM, Andreas Owen a...@conx.ch wrote:
I'm using solr 4.3.1 and the dataimporter. I am trying to use
XPathEntityProcessor within the TikaEntityProcessor for indexing html pages
but I'm getting this error for each document. I have also
(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:722)
On Fri, Sep 27, 2013 at 3:55 AM, Andreas Owen a...@conx.ch wrote:
I removed the FieldReaderDataSource and dataSource="fld" but it didn't
help. I get the following for each document:
DataImportHandlerException: Exception
I'm using solr 4.3.1 and the dataimporter. I am trying to use
XPathEntityProcessor within the TikaEntityProcessor for indexing html pages but
I'm getting this error for each document. I have also tried
dataField="tika.text" and dataField="text" to no avail. The nested
XPathEntityProcessor detail
Why does stripHTML="false" have no effect in dih? The html is stripped in text
and text_nohtml when I display the index with select?q=*
I'm trying to get a field without html and one with it so I can also index the
links on the page.
data-config.xml
<entity name="rec"
sorry, it works like this, i had a typo in my conf :-(
On 17. Sep 2013, at 2:44 PM, Andreas Owen wrote:
i would like to know how to get it to work and delete documents per xml and
dih.
On 17. Sep 2013, at 1:47 PM, Shalin Shekhar Mangar wrote:
What is your question?
On Tue, Sep 17
i would like to know how to get it to work and delete documents per xml and dih.
On 17. Sep 2013, at 1:47 PM, Shalin Shekhar Mangar wrote:
What is your question?
On Tue, Sep 17, 2013 at 12:17 AM, andreas owen a.o...@gmx.net wrote:
i am using dih and want to delete indexed documents by xml
I am using dih and want to delete indexed documents by xml-file with ids. I
have seen $deleteDocById used in <entity query="...">
data-config.xml:
<entity name="rec" processor="XPathEntityProcessor"
url="file:///C:\ColdFusion10\cfusion\solr\solr\tkbintranet\docImportDelete.xml"
forEach="/docs/doc"
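A sketch of how $deleteDocById is typically wired into such an entity (the xpath and the file layout under /docs/doc are assumptions based on the forEach above):

```xml
<entity name="rec" processor="XPathEntityProcessor"
        url="file:///.../docImportDelete.xml"
        forEach="/docs/doc">
  <!-- mapping a column to the special $deleteDocById variable makes DIH
       delete the document with that id instead of adding a new one -->
  <field column="$deleteDocById" xpath="/docs/doc/id"/>
</entity>
```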
no jetty, and yes for tomcat i've seen a couple of answers
On 12. Sep 2013, at 3:12 AM, Otis Gospodnetic wrote:
Using tomcat by any chance? The ML archive has the solution. May be on
Wiki, too.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Sep 11, 2013 8:56 AM, Andreas Owen
could it have something to do with the fact that the meta encoding tag is iso-8859-1 but the
http-header says utf8 and firefox interprets it as utf8?
On 12. Sep 2013, at 8:36 AM, Andreas Owen wrote:
no jetty, and yes for tomcat i've seen a couple of answers
On 12. Sep 2013, at 3:12 AM, Otis Gospodnetic
it was the http-header; as soon as I forced an iso-8859-1 header it worked
On 12. Sep 2013, at 9:44 AM, Andreas Owen wrote:
could it have something to do with the fact that the meta encoding tag is iso-8859-1 but
the http-header says utf8 and firefox interprets it as utf8?
On 12. Sep 2013, at 8:36 AM
.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent: Tuesday, September 10, 2013 7:07 AM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
ok, I am getting there now, but if there are newlines involved the regex stops
as soon as it reaches a \r\n
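If the pattern is in a PatternReplaceCharFilter, the usual fix for this (a sketch, assuming the body-comment extraction pattern discussed later in the thread) is the inline DOTALL flag (?s), which makes . match across \r\n:

```xml
<!-- (?s) turns on DOTALL so the capture group spans line breaks -->
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="(?s).*&lt;!--body--&gt;(.*?)&lt;!--/body--&gt;.*"
            replacement="$1"/>
```

The &lt; and &gt; entities are needed because the pattern lives inside an XML attribute.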
I'm using solr 4.3.1 with tika to index html pages. The html files are
iso-8859-1 (ansi) encoded and the meta tag content-encoding as well. The
server http-header says it's utf8 and firefox-webdeveloper agrees.
When I index a page with special chars like ä,ö,ü solr outputs it completely
wrote:
Use XML then. Although you will need to escape the XML special characters as
I did in the pattern.
The point is simply: Quickly and simply try to find the simple test scenario
that illustrates the problem.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent
, HTML tag stripped
In your original query, you didn't show us what your default field, df
parameter, was.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent: Sunday, September 08, 2013 5:21 AM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
the standard Solr simple post tool.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent: Monday, September 09, 2013 6:40 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
I've downloaded curl and tried it in the command prompt and power shell
? If not, please do so.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent: Monday, September 09, 2013 4:42 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
i index html pages with a lot of lines and not just a string with the
body-tag
6, 2013 at 11:33 AM, Andreas Owen a...@conx.ch wrote:
ok, I have html pages with <html>.<!--body-->content I
want<!--/body-->.</html>. I want to extract (index, store) only
what is between the body-comments. I thought regexTransformer would be the
best because xpath doesn't work in tika and I
Sent: Thursday, September 05, 2013 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
On 9/5/2013 10:03 AM, Andreas Owen wrote:
I would like to filter / replace a word during indexing but it doesn't do
anything and I don't get an error.
In schema.xml I have
fields are being populated.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent: Friday, September 06, 2013 4:01 AM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
the input string is a normal html page with the word Zahlungsverkehr in it
and my
have also found out that the htmlparser from tika cuts my
body-comments out and tries to make well-formed html, which I would like to
switch off.
On 6. Sep 2013, at 5:04 PM, Shawn Heisey wrote:
On 9/6/2013 7:09 AM, Andreas Owen wrote:
i've managed to get it working if i use
I would like to filter / replace a word during indexing but it doesn't do
anything and I don't get an error.
In schema.xml I have the following:
<field name="text_html" type="text_cutHtml" indexed="true" stored="true"
multiValued="true"/>
<fieldType name="text_cutHtml" class="solr.TextField">
<analyzer>
, Shalin Shekhar Mangar wrote:
I don't know much about Tika but in the example data-config.xml that
you posted, the xpath attribute on the field text won't work
because the xpath attribute is used only by a XPathEntityProcessor.
On Thu, Aug 29, 2013 at 10:20 PM, Andreas Owen a...@conx.ch wrote
if
TikaEntityProcessor supports such a thing.
On Wed, Sep 4, 2013 at 12:38 PM, Andreas Owen a...@conx.ch wrote:
so could I just nest it in an XPathEntityProcessor to filter the html, or is
there something like xpath for tika?
<entity name="htm" processor="XPathEntityProcessor" url="${rec.file
I want tika to only index the content in <div id="content">...</div> for the
field text. Unfortunately it's indexing the whole page. Can't xpath do this?
data-config.xml:
<dataConfig>
<dataSource type="BinFileDataSource" name="data"/>
<dataSource type="BinURLDataSource" name="dataUrl"/>
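For reference, a sketch of how this is often attempted with DIH's XPathEntityProcessor instead of tika (DIH only supports a limited xpath subset; the attribute-predicate path shown is an assumption to verify, as is the entity name):

```xml
<entity name="page" processor="XPathEntityProcessor"
        url="${rec.path}${rec.file}" forEach="/html"
        dataSource="data">
  <!-- limited-subset xpath: select only the content div -->
  <field column="text" xpath="/html/body/div[@id='content']"/>
</entity>
```

The trade-off is that XPathEntityProcessor expects reasonably well-formed XML, which raw html pages often are not.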
,
Alex
On 22 Aug 2013 13:34, Andreas Owen a...@conx.ch wrote:
I can do it like this, but then the content isn't copied to text. It's just
in text_test:
<entity name="tika" processor="TikaEntityProcessor"
url="${rec.path}${rec.file}" dataSource="dataUrl">
<field column="text" name="text_test"
<copyField source
I'm trying to index an html page and only use the div with the id=content.
Unfortunately nothing is working within the tika entity; only the standard text
(content) field is populated.
Do I have to use copyField for test_text to get the data?
Or is there a problem with the
events from happening all at
once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
On Thu, Aug 22, 2013 at 11:02 AM, Andreas Owen a...@conx.ch wrote:
I'm trying to index an html page and only use the div with the
id=content. Unfortunately nothing is working within the tika
, at 6:12 PM, Andreas Owen wrote:
I put it in the tika-entity as an attribute, but it doesn't change anything. My
bigger concern is why text_test isn't populated at all.
On 22. Aug 2013, at 5:27 PM, Alexandre Rafalovitch wrote:
Can you try SOLR-4530 switch:
https://issues.apache.org/jira
I have tried post.jar and it works when I set the literal.id in solrconfig.xml.
I can't pass the id with post.jar (-Dparams=literal.id=abc) because I get an
error: could not find or load main class .id=abc.
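That error usually means the shell split the -D argument before java saw it; quoting the whole property, and putting every -D option before -jar, typically fixes it. A sketch (URL and file name are illustrative):

```
java -Durl=http://localhost:8983/solr/update/extract "-Dparams=literal.id=abc" -jar post.jar myTest.txt
```

Anything after -jar post.jar is passed to the tool as a file argument, so a -D placed there is misread, which matches the "main class .id=abc" symptom.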
On 20. Jul 2013, at 7:05 PM, Andreas Owen wrote:
path was set text wasn't
from happening all at
once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
On Fri, Jul 19, 2013 at 12:09 PM, Andreas Owen a...@conx.ch wrote:
i'm using solr 4.3 which i just downloaded today and am using only jars
that came with it. i have enabled the dataimporter
to stored in the schema.xml?
On Sat, Jul 20, 2013 at 3:37 PM, Andreas Owen a...@conx.ch wrote:
they are in my schema, path is typed correctly, and the others are default
fields which already exist. All the other fields are populated and I can
search for them; just path and text aren't.
On 19
I'm using solr 4.3, which I just downloaded today, and am using only jars that
came with it. I have enabled the dataimporter and it runs without error. But
the field path (included in schema.xml) and text (file content) aren't
indexed. What am I doing wrong?
solr-path:
newer jars mixed in with
the old ones.
-- Jack Krupansky
-Original Message- From: Andreas Owen
Sent: Sunday, July 14, 2013 3:07 PM
To: solr-user@lucene.apache.org
Subject: Re: solr autodetectparser tikaconfig dataimporter error
hi
is there no one with an idea what this error
hi
is there no one with an idea what this error is, or can anyone even give me a pointer where
to look? If not, is there an alternative way to import documents from an xml-file
with meta-data and the filename to parse?
thanks for any help.
On 12. Jul 2013, at 10:38 PM, Andreas Owen wrote:
i am using solr
I am using solr 3.5, tika-app-1.4 and tagcloud 1.2.1. When I try to
import a
file via xml I get this error; it doesn't matter what file format I try
to index, txt, cfm, pdf all give the same error:
SEVERE: Exception while processing: rec document :
SolrInputDocument[{id=id(1.0)={myTest.txt},