Re: dovecot full text search

2019-12-15 Thread Joan Moreau
Hi 


The first run of indexing on a large existing mailbox is indeed slow,
and I would run "doveadm index -A -q \*" before putting the system in
production. 
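
A minimal sketch of how that pre-production indexing run could be scripted, assuming `doveadm` is on the PATH and the script runs as a user allowed to iterate over all mailboxes. The helper names `build_index_command` and `run_index` are made up for illustration; only the `doveadm index` flags `-A` (all users), `-u` (one user) and `-q` (queue the job to the indexer service) come from doveadm itself. No shell is involved here, so the mailbox mask is passed as a plain `*` rather than `\*`.

```python
import subprocess


def build_index_command(mailbox_mask="*", user=None, queue=True):
    """Return the argv list for a `doveadm index` run.

    user=None indexes every user (-A); otherwise only the given user (-u).
    """
    cmd = ["doveadm", "index"]
    cmd += ["-A"] if user is None else ["-u", user]
    if queue:
        # -q hands the work to the indexer service instead of indexing inline.
        cmd.append("-q")
    cmd.append(mailbox_mask)
    return cmd


def run_index(**kwargs):
    """Run the command and raise if doveadm exits with a non-zero status."""
    subprocess.run(build_index_command(**kwargs), check=True)


if __name__ == "__main__":
    # Printing first makes it easy to review the command before running it.
    print(" ".join(build_index_command()))
```

Calling `run_index()` would then start the queued run for all users; `run_index(user="someone", mailbox_mask="INBOX")` limits it to one (hypothetical) account.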

Besides the RAM disk, what kind of solution would you suggest?


On 2019-12-10 19:28, Wojciech Puchar via dovecot wrote:

>> Where do write ops take place?
>
> to the xapian index subdirectory
>
>> Maybe mount that path to a RAM disk rather than looking for another solution.
>
> not a solution for the problem, but a workaround


Re: dovecot full text search

2019-12-10 Thread Wojciech Puchar via dovecot

> Where do write ops take place?

to the xapian index subdirectory

> Maybe mount that path to a RAM disk rather than looking for another solution.

not a solution for the problem, but a workaround









Re: dovecot full text search

2019-12-10 Thread Admin via dovecot
Where do write ops take place?
Maybe mount that path to a RAM disk rather than looking for another solution.




dovecot full text search

2019-12-10 Thread Wojciech Puchar via dovecot
what FTS module should i use instead of squat, which is probably no longer
supported or no longer exists at all?


i want to upgrade my dovecot installation. it currently uses squat but i
found it often crashes on FTS on large mailboxes.


i found the "xapian" addon for dovecot, but while it works excellently AFTER
the database is created, i found it needs 20 or so minutes to index less
than 10GB of mail, and while doing this it generates many tens of
megabytes/s of constant write traffic on its database files.


Excellent way of killing an SSD.

something must be broken.

my config is

plugin {
   plugin = fts fts_xapian

   fts = xapian
   fts_xapian = partial=2 full=20 verbose=0

   fts_autoindex = yes
   fts_enforced = yes

#   fts_autoindex_exclude = \Junk
#   fts_autoindex_exclude2 = \Trash
}


any ideas?
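
To put rough numbers on the write traffic described above, here is a back-of-the-envelope sketch. The 30 MB/s rate is an assumed stand-in for "many tens of megabytes/s"; the 20 minutes and 10 GB are the upper bounds from the message. With those figures the initial indexing would write about 36 GB, roughly 3.6 times the size of the mail being indexed.

```python
def total_written_gb(rate_mb_per_s, minutes):
    """Total data written, in GB (1 GB = 1000 MB), at a constant write rate."""
    return rate_mb_per_s * minutes * 60 / 1000


def write_amplification(written_gb, mailbox_gb):
    """How many bytes hit the disk per byte of mail indexed."""
    return written_gb / mailbox_gb


if __name__ == "__main__":
    # Assumed figures: 30 MB/s is a guess at "many tens of megabytes/s";
    # 20 minutes and 10 GB are the upper bounds quoted in the message.
    written = total_written_gb(30, 20)
    print(f"{written:.0f} GB written")
    print(f"{write_amplification(written, 10):.1f}x the mailbox size")
```

A different assumed rate scales the result linearly, so measuring the real rate (for example with iostat) would replace the guess.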


Re: Dovecot Full Text Search: HTTP 500 : Unknown fieldType 'text_general' specified on field text. [SERIOUS]

2015-03-05 Thread Muzaffer Tolga Ozses
Make that *text* instead of *text_general*
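
The "Unknown fieldType" failure means a field refers to a type name that the schema never declares. A small sketch of how one might check a schema for that before restarting Solr follows. The function name `undefined_field_types` and the trimmed `SAMPLE` schema are hypothetical and only mirror the situation in this thread; the real schema.xml will contain many more entries.

```python
import xml.etree.ElementTree as ET


def undefined_field_types(schema_xml):
    """Map each field name to its type when that type is never declared."""
    root = ET.fromstring(schema_xml)
    # Older schemas spell the element <fieldtype>, newer ones <fieldType>.
    declared = {
        el.get("name")
        for tag in ("fieldType", "fieldtype")
        for el in root.iter(tag)
    }
    missing = {}
    for tag in ("field", "dynamicField"):
        for el in root.iter(tag):
            if el.get("type") not in declared:
                missing[el.get("name")] = el.get("type")
    return missing


# Hypothetical, heavily trimmed schema reproducing the situation in the thread.
SAMPLE = """
<schema name="dovecot" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="body" type="text" indexed="true" stored="false"/>
    <field name="text" type="text_general" indexed="true" stored="false"/>
  </fields>
</schema>
"""

if __name__ == "__main__":
    print(undefined_field_types(SAMPLE))
```

For the sample this reports the `text` field as pointing at the undeclared type `text_general`; changing that attribute to `text` makes the result empty.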


Re: Dovecot Full Text Search: HTTP 500 : Unknown fieldType 'text_general' specified on field text. [SERIOUS]

2015-03-05 Thread Kevin Laurie
Can anyone here enlighten me on this?


Dovecot Full Text Search: HTTP 500 : Unknown fieldType 'text_general' specified on field text. [SERIOUS]

2015-03-05 Thread Kevin Laurie
Hi Muzaffer,
I get the error below when I try to add it as a field.
I don't think text_general is a valid field type?

HTTP ERROR 500

Problem accessing /solr/. Reason:

{msg=SolrCore 'collection1' is not available due to init failure:
Could not load conf for core collection1: Unknown fieldType
'text_general' specified on field text. Schema file is
/opt/solr/solr/collection1/conf/schema.xml,trace=org.apache.solr.common.SolrException:
SolrCore 'collection1' is not available due to init failure: Could not
load conf for core collection1: Unknown fieldType 'text_general'
specified on field text. Schema file is
/opt/solr/solr/collection1/conf/schema.xml
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:745)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:307)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Could not load conf for core collection1: Unknown fieldType 'text_general' specified on field text. Schema file is /opt/solr/solr/collection1/conf/schema.xml
at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:66)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
... 1 more
Caused by: org.apache.solr.common.SolrException: Unknown fieldType 'text_general' specified on field text. Schema file is /opt/solr/solr/collection1/conf/schema.xml
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:595)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:166)
at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:90)
at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)
... 7 more
Caused by: org.apache.solr.common.SolrException: Unknown fieldType 'text_general' specified on field text
at org.apache.solr.schema.IndexSchema.loadFields(IndexSchema.java:638)
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:489)
... 12 more
,code=500}


On Thu, Mar 5, 2015 at 5:04 PM, Muzaffer Tolga Ozses  wrote:
> Sure thing
>
> On 5 March 2015 at 11:52, Kevin Laurie  wrote:
>>
>> No i dont have it.
>> there is body field though. I think text is needed. Let me add it in and
>> see.
>> Thanks
>>
>>
>> On Thu, Mar 5, 2015 at 4:42 PM, Muzaffer Tolga Ozses 
>> wrote:
>>

Re: Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]

2015-03-05 Thread Kevin Laurie
Below is my schema.xml

 id
 body
 



On Thu, Mar 5, 2015 at 3:48 PM, Leon Kyneur  wrote:
> In your schema.XML check you have defined:
>
>  multiValued="true"/>


Re: Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]

2015-03-05 Thread Leon Kyneur
In your schema.XML check you have defined:




Re: Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]

2015-03-05 Thread Muzaffer Tolga Ozses
Paste your xml file here.




-- 
mto


Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]

2015-03-05 Thread Kevin Laurie
Hello,
My Dovecot constantly runs into this error.
I want to fix this once and for all; I am tired of troubleshooting, so
could someone please give me a lasting and proper solution? I
think it's a problem with the dovecot-solr module.

Please tell me how to find the root of this problem with Dovecot.
Searching the message body always fails (with no results), while other
searches work (e.g. by date, subject, etc.). I believe the "text" field
is missing. Please help. Desperate
here!
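
One way to confirm whether a field called `text` really is missing is to look for it in schema.xml directly, including any dynamicField patterns that could match it. The sketch below does that; `field_is_defined` and the trimmed `SAMPLE` schema are hypothetical names used only for illustration. Which field Solr queries by default is configured outside the schema, so the choice of `text` here is taken from the error message and is otherwise an assumption.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase


def field_is_defined(schema_xml, name):
    """True if `name` is an explicit <field> or matches a <dynamicField> pattern."""
    root = ET.fromstring(schema_xml)
    if any(el.get("name") == name for el in root.iter("field")):
        return True
    # dynamicField names are glob-like patterns such as "*_txt".
    return any(
        fnmatchcase(name, el.get("name", ""))
        for el in root.iter("dynamicField")
    )


# Hypothetical, trimmed schema: it defines "body" but no field called "text".
SAMPLE = """
<schema name="dovecot" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="body" type="text" indexed="true" stored="false"/>
    <dynamicField name="*_txt" type="text" indexed="true" stored="false"/>
  </fields>
</schema>
"""

if __name__ == "__main__":
    for candidate in ("body", "text"):
        print(candidate, field_is_defined(SAMPLE, candidate))
```

If the check comes back false for `text`, either the field has to be added to the schema or the query has to be pointed at a field that does exist, such as `body`.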




2/25/2015, 11:32:30 PM ERROR SolrCore org.apache.solr.common.SolrException: undefined field text

org.apache.solr.common.SolrException: undefined field text
at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1269)
at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:434)
at org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74)
at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175)
at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:207)
at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:374)
at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:742)
at org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:541)
at org.apache.solr.parser.QueryParser.Term(QueryParser.java:299)
at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:185)
at org.apache.solr.parser.QueryParser.Query(QueryParser.java:107)
at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:96)
at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:151)
at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50)
at org.apache.solr.search.QParser.getQuery(QParser.java:141)
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:148)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:197)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1739)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


Re: [Dovecot] Full text search improvements

2013-12-05 Thread Timo Sirainen
On 5.12.2013, at 10.40, Steffen Kaiser  wrote:

>> 9. Attachments can be translated to indexable UTF-8 text already with 
>> fts_decoder setting by doing it via a conversion script. This could also 
>> support Apache Tika server directly.
> 
> This means some kind of MIME type based (or file type guesser) "... to UTF8 
> text" converter script? Some users would find that very very very ^ n nice. 
> There are already several programs used in the field of CMS.

That’s already been possible since v2.1: 
http://hg.dovecot.org/dovecot-2.2/file/342f6962390e/src/plugins/fts/decode2text.sh



Re: [Dovecot] Full text search improvements

2013-12-05 Thread Steffen Kaiser

On Sat, 30 Nov 2013, Timo Sirainen wrote:

> 7. Don't index non-text data? For example if there is large block of 
> base64 data or something else that definitely doesn't look like text, 
> it's pretty useless to index it. Then again, we do want to index all 
> kinds of IDs that someone might want to search. This could be a bit 
> difficult to implement well.
>
> 9. Attachments can be translated to indexable UTF-8 text already with 
> fts_decoder setting by doing it via a conversion script. This could also 
> support Apache Tika server directly.


This means some kind of MIME type based (or file type guesser) "... to 
UTF8 text" converter script? Some users would find that very very very ^ n 
nice. There are already several programs used in the field of CMS.


-- 
Steffen Kaiser


Re: [Dovecot] Full text search improvements

2013-12-04 Thread Michael M Slusarz

Quoting Timo Sirainen:

> 1. Support for multiple languages. Use textcat while indexing to
> guess the language of the indexed data.


FWIW, you could probably use the Content-Language header (if it  
exists) to at least give a hint.  No guarantee it is correct, but it's  
a better starting place than simply scanning all languages.


And, for that matter, you could leverage Accept-Language also (again,  
if it exists).  Which might be more useful, since it lists all the  
languages the user recognizes.
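
A minimal Python sketch of the header-hint idea, using only the standard library `email` module. The function name `language_hints` is made up for illustration, and nothing here reflects how Dovecot would actually do it. It simply collects language tags from Content-Language first and Accept-Language second, so a language guesser could try those candidates before anything else.

```python
from email import message_from_string


def language_hints(raw_message):
    """Return language tags found in the message headers, most trusted first.

    Hypothetical helper: Content-Language describes the message itself,
    Accept-Language lists what the sender claims to read, so it comes second.
    """
    msg = message_from_string(raw_message)
    hints = []
    for header in ("Content-Language", "Accept-Language"):
        value = msg.get(header, "")
        for tag in value.split(","):
            # Drop quality parameters such as ";q=0.8" and normalize case.
            tag = tag.split(";")[0].strip().lower()
            if tag and tag not in hints:
                hints.append(tag)
    return hints


raw = (
    "Content-Language: de\n"
    "Accept-Language: en-US, de;q=0.8\n"
    "Subject: test\n"
    "\n"
    "Hallo Welt\n"
)
print(language_hints(raw))
```

The hints are only a starting order for detection. As noted above, neither header is guaranteed to be correct or even present.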


michael



Re: [Dovecot] Full text search improvements

2013-12-04 Thread Metro Domain Admin
Substring match is important to us, so we'd love to see Squat reinstated 
with speed improvements. It seems like Solr can handle substrings as 
well ([Edge]NGramFilterFactory), but for small deployments, having the 
engine built right in is a plus.




Re: [Dovecot] Full text search improvements

2013-12-02 Thread Timo Sirainen
On 3.12.2013, at 0.09, Mike Abbott  wrote:

>> Do you think [moving IMAP IDLE connections to a separate imap-idle process] 
>> would work for you also?
> 
> Probably.  It always depends on the details.  Forking a new imap process 
> every time there's a little input to read or output to send might perform 
> poorly under load.  Having a pool of ready imap processes could help that, 
> when the configuration permits (e.g. all mail owned by one uid).  It would be 
> interesting to compare client_limit > 1 vs. an idle connection aggregator.

I was thinking that you’d have a pool of imap processes waiting and being 
reused. Some state would be transferred between the imap-idle and imap 
processes. And it could work also for non-IDLEing idling connections. Then 
there needs to be some kind of a good balance of figuring out when to move 
connection to imap-idle to maximize the amount of time it’s there but also to 
minimize unnecessary CPU-wasting transfers.. Oh, and this would be possible 
also with multiple UIDs (although imap-idle might have to run as root then).

> What's so evil about client_limit > 1 besides requiring one uid, the indexer 
> polling I mentioned, and broken fcntl-style file locks?  Or is that enough?

Mainly that there are so many possible reasons for why imap process might 
block. It’s not possible to make all of them asynchronous. I guess getting rid 
of the longest waits could help, but I still wouldn’t dare to run that in 
production.



Re: [Dovecot] Full text search improvements

2013-12-02 Thread Mike Abbott
> Do you think [moving IMAP IDLE connections to a separate imap-idle process] 
> would work for you also?

Probably.  It always depends on the details.  Forking a new imap process every 
time there's a little input to read or output to send might perform poorly 
under load.  Having a pool of ready imap processes could help that, when the 
configuration permits (e.g. all mail owned by one uid).  It would be 
interesting to compare client_limit > 1 vs. an idle connection aggregator.

What's so evil about client_limit > 1 besides requiring one uid, the indexer 
polling I mentioned, and broken fcntl-style file locks?  Or is that enough?


Re: [Dovecot] Full text search improvements

2013-12-02 Thread Gedalya

On 12/02/2013 02:41 PM, Timo Sirainen wrote:

> Currently I’m thinking that most of the reasons for client_limit>1 can be 
> avoided just by moving IMAP IDLE connections to a separate imap-idle process where 
> they wait until they have more work to do. Do you think that would work for you 
> also?

I was exactly thinking about the same thing.. I wanted to request this 
feature but I guess I was too shy to write about it :D
I think a special IDLE process would be a wonderful idea. I find that 
otherwise client_limit>1 doesn't really work. It gets especially 
annoying when a client with a large mailbox makes a process grow and it 
doesn't shrink back, is there some insight about that? And, after 
service_count is maxed out, you end up having lots of processes waiting 
for the last 1 or 2 IDLEing clients to quit, so your total number of 
processes is really much larger than total connections / client_limit.


Re: [Dovecot] Full text search improvements

2013-12-02 Thread Timo Sirainen
On 2.12.2013, at 20.50, Mike Abbott  wrote:

>> how [FTS indexing] could be improved for everyone in future
> 
> For sites which set client_limit > 1 it would help performance not to stall 
> for INDEXER_WAIT_MSECS when polling the indexer for input.  Currently dovecot 
> unwinds back out to the main command loop repeatedly to allow other clients 
> to use the process but it also stalls the whole process for 
> INDEXER_WAIT_MSECS every time it finds no input from the indexer, which hurts 
> responsiveness for those other clients.  This can be avoided by removing the 
> client's I/O from the main ioloop and adding the indexer's instead, or 
> perhaps by leveraging CLIENT_COMMAND_STATE_WAIT_EXTERNAL.

Gets a bit tricky to implement, at least without changing the lib-storage API. 
I did have some plans for this earlier where lib-storage could call some 
callback when there is more data available for search/fetch/mailbox_open/etc 
functions. Currently I’m thinking that most of the reasons for client_limit>1 
can be avoided just by moving IMAP IDLE connections to a separate imap-idle 
process where they wait until they have more work to do. Do you think that 
would work for you also?



Re: [Dovecot] Full text search improvements

2013-12-02 Thread Mike Abbott
> how [FTS indexing] could be improved for everyone in future

For sites which set client_limit > 1 it would help performance not to stall for 
INDEXER_WAIT_MSECS when polling the indexer for input.  Currently dovecot 
unwinds back out to the main command loop repeatedly to allow other clients to 
use the process but it also stalls the whole process for INDEXER_WAIT_MSECS 
every time it finds no input from the indexer, which hurts responsiveness for 
those other clients.  This can be avoided by removing the client's I/O from the 
main ioloop and adding the indexer's instead, or perhaps by leveraging 
CLIENT_COMMAND_STATE_WAIT_EXTERNAL.

Third-party FTS implementations may benefit from having the NOT/AND/OR 
seq_range_array merging logic in squat_lookup_arg() generalized and made 
available to all.
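
To illustrate what a generalized NOT/AND/OR merge could look like, here is a small Python sketch. It is not the logic in squat_lookup_arg() and does not use Dovecot's seq_range_array; plain Python sets stand in for the UID ranges, and the tuple-based query format and the `lookup` callback are assumptions made up for this example.

```python
def eval_query(node, all_uids, lookup):
    """Evaluate a query tree against a backend-specific term lookup.

    node is ("term", word), ("not", child), ("and", child, ...) or
    ("or", child, ...). lookup(word) returns the UIDs matching one word.
    """
    op = node[0]
    if op == "term":
        return set(lookup(node[1]))
    if op == "not":
        # NOT needs the full UID set of the mailbox to subtract from.
        return all_uids - eval_query(node[1], all_uids, lookup)
    parts = [eval_query(child, all_uids, lookup) for child in node[1:]]
    if op == "and":
        return set.intersection(*parts)
    if op == "or":
        return set.union(*parts)
    raise ValueError("unknown operator: %r" % (op,))


# Toy backend: word -> UIDs of the mails containing it.
postings = {"foo": {1, 2, 3}, "bar": {2, 3, 4}, "baz": {3}}
all_uids = {1, 2, 3, 4, 5}
query = ("and",
         ("term", "foo"),
         ("or", ("term", "bar"), ("term", "baz")),
         ("not", ("term", "baz")))
result = eval_query(query, all_uids, lambda w: postings.get(w, set()))
print(sorted(result))
```

The point of the sketch is that only `lookup` is backend-specific; the merging itself is generic, which is why sharing it across FTS backends seems attractive.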

It would also be helpful if FTS expunge were asynchronous, but this is not 
critical.



[Dovecot] Full text search improvements

2013-11-30 Thread Timo Sirainen
FTS indexing is something I hear about quite often nowadays. I’ve added some hacks to 
make it work better for some installations, but it’s about time to think about 
the whole design and how it could be improved for everyone in future. Here are 
some of my initial thoughts.

Currently Dovecot supports 3 full text search engines: Solr, CLucene and 
Dovecot Squat. CLucene plugin has various features built in, which should have 
been built in a generic way to work with all the engines (although Solr has 
most of those already built-in). Squat was abandoned a few years ago in favor 
of Solr/CLucene, but perhaps it could be brought back to life, since it looks 
like its index sizes could be smaller than Lucene's.

Here's a list of things that should be added to generic Dovecot FTS code to 
improve all the backends:

1. Support for multiple languages. Use textcat while indexing to guess the 
language of the indexed data. (Perhaps run it separately for each paragraph to 
handle multi-language mails? Or at least many emails begin/end with different 
language than the text in the middle, e.g. "Foo Bar wrote:" is often in various 
languages.) Index the data using the detected language's stemming and other 
features. Keep track of which languages have been used in the index, and when 
searching stem the search words to all the used languages. Since each added 
language requires additional searches and there's the possibility of wrong 
detection, the list of allowed languages could be configurable. See also 
http://ntextcat.codeplex.com/ or at least change textcat to use UTF8.

2. Word stemming. This can be done for many languages with Snowball library. 
Solr has also implemented several other languages, perhaps its code can be 
somehow automatically translated to C(++) for use with Dovecot?

3. Don't index language-specific stopwords. We can get the word lists from e.g. 
Solr.

4. Try to detect compound words and index each part separately for languages 
that use them. http://wiki.apache.org/solr/LanguageAnalysis#Decompounding 
suggests two possible ways to do it.

5. Normalize words (e.g. drop diacritics). libicu can be used for this.

6. Drop (Unicode) characters that don't belong to the language? Or especially 
don't index most of the weird Unicode characters. This would avoid filling the 
index with unnecessary garbage.

7. Don't index non-text data? For example if there is large block of base64 
data or something else that definitely doesn't look like text, it's pretty 
useless to index it. Then again, we do want to index all kinds of IDs that 
someone might want to search. This could be a bit difficult to implement well.

8. Index attachments separately, so it would be possible to search only 
attachments. (Should "SEARCH BODY word1 BODY word2" return matches if word1 and 
word2 are in different attachments?)

9. Attachments can be translated to indexable UTF-8 text already with 
fts_decoder setting by doing it via a conversion script. This could also 
support Apache Tika server directly.

10. It should be configurable which fields are indexed. Body and header would 
always be separately indexed. Optionally there could be also at least: 
attachments, From, To, Cc, Bcc and Subject. The From/To/Cc/Bcc could also be 
indexed together in one "addresses" field. The more fields there are, the 
larger the index, but better/faster search results.

11. Each indexed mail should have metadata: Mailbox GUID, mail UID and the 
language the mail was indexed with. For attachments there should also be the 
MIME part number. When matching results, drop results if returned language 
doesn't match the query language.
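
Items 3 and 5 in the list above (dropping stopwords and normalizing words) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Python's standard `unicodedata` module rather than libicu, and the `STOPWORDS` table is a tiny made-up stand-in for real per-language lists such as the ones Solr ships.

```python
import unicodedata

# Hypothetical, deliberately tiny stopword table keyed by language code.
STOPWORDS = {"en": {"the", "and", "of"}}


def normalize_word(word):
    """Lowercase the word and drop diacritics (combining marks)."""
    decomposed = unicodedata.normalize("NFKD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_terms(text, lang):
    """Return the normalized words of text, minus the language's stopwords."""
    stop = STOPWORDS.get(lang, set())
    words = (normalize_word(w) for w in text.split())
    return [w for w in words if w and w not in stop]


print(index_terms("The Café of Zürich", "en"))
```

The same normalization has to be applied to search terms at query time, otherwise "cafe" in the index would never match a search for "Café".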

Squat
-----

Currently Squat index consists of a trie containing all the words and pointer 
to a file listing all the message UIDs that contain them. Each node in the trie 
has a pointer to the UIDs, so e.g. with "abc" the "a" node will contain UIDs of 
all mails that contain the "a" letter (e.g. 1,3-5,10). "ab" node will contain 
mails that have the "ab" substring. Since the "ab" is a subset of "a", the "ab" 
won't contain UIDs directly but instead it contains indexes to the "a" list to 
get a better compression (e.g. UID 3-5,10 -> 2-4 indexes in the "1,3-5,10" 
list). The "abc" node then similarly refers to the "ab" node's indexes.
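
The parent-relative encoding described above can be sketched in Python as follows. This is not Squat's on-disk format, only an illustration of the idea; the function names are invented, and 1-based positions are an assumption. With that assumption, UIDs 3-5 in the "1,3-5,10" list become positions 2-4 and UID 10 becomes position 5.

```python
def to_parent_indexes(parent_uids, child_uids):
    """Encode child UIDs as 1-based positions in the sorted parent UID list."""
    position = {uid: i for i, uid in enumerate(parent_uids, start=1)}
    return [position[uid] for uid in child_uids]


def from_parent_indexes(parent_uids, indexes):
    """Decode 1-based positions back into UIDs using the parent list."""
    return [parent_uids[i - 1] for i in indexes]


a_uids = [1, 3, 4, 5, 10]   # node "a": UIDs 1,3-5,10
ab_uids = [3, 4, 5, 10]     # node "ab": always a subset of "a"

ab_indexes = to_parent_indexes(a_uids, ab_uids)
print(ab_indexes)
print(from_parent_indexes(a_uids, ab_indexes))
```

The encoded positions are smaller and denser than the raw UIDs, so they tend to collapse into fewer, shorter ranges, which is where the better compression comes from.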

It's configurable how long words Squat will index. Also substring matching is 
configurable. By default both are 4 letters, so words longer than 4 letters 
will be split to 4 letter pieces which are indexed (e.g. "dovecot" -> "dove", 
"ovec", "veco", "ecot"). When searching these pieces are looked up and the 
results are merged.
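
A short Python sketch of the splitting and merging just described, assuming the default length of 4. The dictionary-of-sets index is a simplification for illustration (Squat itself uses the trie described above), and all function names are made up.

```python
def squat_pieces(word, n=4):
    """Split a word into overlapping n-letter pieces; short words stay whole."""
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def build_index(mails, n=4):
    """Map every piece to the set of UIDs whose text contains it."""
    index = {}
    for uid, text in mails.items():
        for word in text.lower().split():
            for piece in squat_pieces(word, n):
                index.setdefault(piece, set()).add(uid)
    return index


def search(index, term, n=4):
    """Look up each piece of the term and intersect the UID sets."""
    sets = [index.get(p, set()) for p in squat_pieces(term.lower(), n)]
    return set.intersection(*sets)


mails = {1: "dovecot imap server", 2: "pigeon dove"}
index = build_index(mails)
print(squat_pieces("dovecot"))
print(sorted(search(index, "dovecot")))
print(sorted(search(index, "dove")))
```

Note that intersecting piece matches only yields candidates: a mail containing all the pieces in unrelated places would also match, so a real implementation would need to verify the hits in some way.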

It's pretty pointless to do a search for 1-2 letter substrings. Most likely the 
user wants to find 1-2 letter word instead. Perhaps this is true also for 3 
letters? The Squat index could be changed to only add results for the first 1-2 
(or 1-3?) letters only for full words, not to word prefixes. This of course 
would mean that the "ab" referring to "a" UID list 

Re: [Dovecot] Full-text search

2013-02-22 Thread Timo Sirainen
On 18.2.2013, at 18.10, Valery V. Sedletski  wrote:

> I discovered that the full-text search (fts) plugin can work without
> SQUAT/LUCENE/SOLR backend. I.e., Dovecot creates separate indexes for
> header search in files dovecot.index and dovecot.index.cache. Even the
> search by headers is fast enough, and can search for phrases. Also, it
> seems that this built-in search is faster than Solr-based search.

Well, that depends on the mailbox size. The built-in search probably does more 
work than Solr, but the latency is better I guess.

> But if I
> enable the Solr backend (fts_solr), then the FTS generic plugin built-in
> search becomes disabled.
> But Solr-based full-text search is faster when searching inside message bodies.
> Also, it appears that the built-in search inside message bodies works too,
> but it is very slow (it seems that is because it is a dumb file-based search that
> does not use indexes at all).
> So, my question is: is it possible to combine the built-in search with Solr
> or Lucene plugin-based search so that the first one searches by headers,
> and the second one works by bodies?
> This could make the advantages of both search methods combined.

If you unconditionally want to remove it, that's easy. Just have 
fts_header_want_indexed() in fts-api.c always return FALSE. 

But there are also advantages to searching all headers through Solr, even if 
it's slower, because it can do inexact matching. For example "query" can match 
"queries" and so on.



[Dovecot] Full-text search

2013-02-18 Thread Valery V. Sedletski
Hi all
I discovered that the full-text search (fts) plugin can work without
SQUAT/LUCENE/SOLR backend. I.e., Dovecot creates separate indexes for
header search in files dovecot.index and dovecot.index.cache. Even the
search by headers is fast enough, and can search for phrases. Also, it
seems that this built-in search is faster than Solr-based search. But if I
enable the Solr backend (fts_solr), then the FTS generic plugin built-in
search becomes disabled.
But Solr-based full-text search is faster when searching inside message bodies.
Also, it appears that the built-in search inside message bodies works too,
but it is very slow (it seems that is because it is a dumb file-based search that
does not use indexes at all).
So, my question is: is it possible to combine the built-in search with Solr
or Lucene plugin-based search so that the first one searches by headers,
and the second one works by bodies?
This could make the advantages of both search methods combined.
WBR,
valery




Re: [Dovecot] Full text search in attachments

2012-08-19 Thread Timo Sirainen
On 19.8.2012, at 21.57, Mailing wrote:

> On 19.08.2012 17:26, Timo Sirainen wrote:
>> On 11.8.2012, at 5.28, Mailing wrote:
>> I updated the wiki with:
>> 
>> * See the decode2text.sh script included in Dovecot for how to use this.
> 
> I want to send the complete attachments (unparsed) to the Solr server and let 
> Solr do the parsing work. Is it maybe possible to use a decode2text-like 
> script together with curl to send the attachments to the Solr server?
> 
> But in this case I would have to know additional information in the script, 
> like the message ID and the mailbox name.

It can't work like that, because all the text from one message (text & all 
attachments) has to go to one document.



Re: [Dovecot] Full text search in attachments

2012-08-19 Thread Mailing

Hi Timo,

On 19.08.2012 17:26, Timo Sirainen wrote:

> On 11.8.2012, at 5.28, Mailing wrote:
> I updated the wiki with:
>
>  * See the decode2text.sh script included in Dovecot for how to use this.


I want to send the complete attachments (unparsed) to the Solr server and 
let Solr do the parsing work. Is it maybe possible to use a decode2text-like 
script together with curl to send the attachments to the Solr 
server?


But in this case I would have to know additional information in the 
script, like the message ID and the mailbox name.


Best regards,

Sebastian



Re: [Dovecot] Full text search in attachments

2012-08-19 Thread Timo Sirainen
On 11.8.2012, at 5.28, Mailing wrote:

> is it possible to use the Solr full text search plugin for indexing mail 
> attachments? I found a very old patch and some hints regarding a fts_decoder 
> script that I don't understand.
> 
> Making Solr index PDF or Office files shouldn't be that difficult, but how 
> can I enable the plugin to transfer the attachments to Solr?


I updated the wiki with: 

 * See the decode2text.sh script included in Dovecot for how to use this.



[Dovecot] Full text search in attachments

2012-08-10 Thread Mailing

Hello,

is it possible to use the Solr full text search plugin for indexing 
mail attachments? I found a very old patch and some hints regarding a 
fts_decoder script that I don't understand.


Making Solr index PDF or Office files shouldn't be that difficult, 
but how can I enable the plugin to transfer the attachments to Solr?



Best regards,

Sebastian