RE: Problem with manifold

2012-11-07 Thread Gonzalez, Pablo
-5-21-2039231098-2614715072-2050932820-1107
 
 allow_token_document:active_dir:S-1-1-0 
-deny_token_document:active_dir:S-1-1-0 
 allow_token_document:ad:S-1-5-32-545 -deny_token_document:ad:S-1-5-32-545  
 allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820- 
-deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820- 
 allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-513 
-deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-513 
 allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1113 
-deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1113 
 allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1110 
-deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1110 
 allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1107 
-deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1107 
 allow_token_document:ad:S-1-1-0 -deny_token_document:ad:S-1-1-0)

This is the _document security chunk of the BooleanQuery (with all the SIDs 
quoted with "" so that the parser doesn't treat active_dir as a field just 
because it is followed by a :). The query gives the expected results.

Thinking about it, the truth is that when we configured our security policies 
through Active Directory we did not take share-level policies into 
consideration. Our users are authenticated only at the document level. Anyway, 
I don't think this gives us any clue as to why my handler isn't working.

For now, I could modify my own component to handle the _document-level 
security alone and ignore the _share-level. I think that would work, and it is 
what I will try first, but I really think there must be another way to do it, 
so if this data gives you any ideas, please let me know.

Either way, I will tell you whether it worked.

Thanks,

Pablo



Re: Problem with manifold

2012-11-07 Thread Karl Wright
-2050932820-
  
 -deny_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-
  
 allow_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-513
  
 -deny_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-513
  
 allow_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1113
  
 -deny_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1113
  
 allow_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
  
 -deny_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
  
 allow_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1107
  
 -deny_token_document:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1107
  allow_token_document:active_dir:S-1-1-0 
 -deny_token_document:active_dir:S-1-1-0
  allow_token_document:ad:S-1-5-32-545 -deny_token_document:ad:S-1-5-32-545
  allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820- 
 -deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-
  allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-513 
 -deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-513
  allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1113 
 -deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1113
  allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1110 
 -deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1110
  allow_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1107 
 -deny_token_document:ad:S-1-5-21-2039231098-2614715072-2050932820-1107
  allow_token_document:ad:S-1-1-0 -deny_token_document:ad:S-1-1-0)


RE: Problem with manifold

2012-11-07 Thread Karl Wright
Hi Pablo,

Yes, I don't think you included the schema before. Having a default of
__nosecurity__ is critical. Were the instructions unclear?

And yes, this is safe: all it does is guarantee that Solr fields
without any value get one that can be queried on.
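
Why the default matters can be seen in a small model of the filter. The sketch below is not the plugin's code. It is a simplified Python restatement of the clause shape printed elsewhere in this thread (`+((+allow:__nosecurity__ +deny:__nosecurity__) allow:T -deny:T ...)`), read with ordinary Lucene BooleanQuery semantics: at least one unprefixed clause must match and no `-` clause may match. The function names and the sample document are made up for illustration.

```python
NOSECURITY = "__nosecurity__"
FIELDS = ("allow_token_share", "deny_token_share",
          "allow_token_document", "deny_token_document")


def level_matches(allow, deny, user_tokens):
    """Evaluate one level's clause against a document's allow/deny values."""
    unprotected = NOSECURITY in allow and NOSECURITY in deny
    allowed = any(t in allow for t in user_tokens)
    denied = any(t in deny for t in user_tokens)
    return (unprotected or allowed) and not denied


def visible(doc, user_tokens):
    """Both levels are required: each top-level clause carries a '+'."""
    return (level_matches(doc["allow_token_share"],
                          doc["deny_token_share"], user_tokens)
            and level_matches(doc["allow_token_document"],
                              doc["deny_token_document"], user_tokens))


def with_defaults(doc):
    """Mimic the schema default: an empty field is indexed as __nosecurity__."""
    return {f: (doc.get(f) or [NOSECURITY]) for f in FIELDS}


user = ["active_dir:S-1-5-32-545", "active_dir:S-1-1-0"]
raw = {"allow_token_document": ["active_dir:S-1-1-0"]}  # no share tokens indexed

no_default = {f: raw.get(f, []) for f in FIELDS}
print(visible(no_default, user))          # False: the share clause cannot match
print(visible(with_defaults(raw), user))  # True: share level counts as unprotected
```

In this model a document with empty share fields can never satisfy the share clause, which would be consistent with the `hits=0` seen earlier. The default does not loosen the document-level check.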

Karl

From: Gonzalez, Pablo
Sent: 11/7/2012 6:08 AM
To: user@manifoldcf.apache.org
Subject: RE: Problem with manifold
Well, I did two things:
- First, I did what I described in my last message: I changed my
  component to care only about document-level security, and that way
  the query worked.
- Then I realized that the documents I had indexed only had _document
  tokens and no _share tokens at all. THAT is the real problem. So I
  changed the field definitions as follows:
   <field name="allow_token_document" type="string" indexed="true"
          stored="true" multiValued="true" required="true"
          default="__nosecurity__"/>
   <field name="deny_token_document" type="string" indexed="true"
          stored="true" multiValued="true" required="true"
          default="__nosecurity__"/>
   <field name="allow_token_share" type="string" indexed="true"
          stored="true" multiValued="true" required="true"
          default="__nosecurity__"/>
   <field name="deny_token_share" type="string" indexed="true"
          stored="true" multiValued="true" required="true"
          default="__nosecurity__"/>

Then I used the default /select handler and it worked. But my question
is: is this safe? As I understand it, this means: if the document being
indexed has no share-level security restrictions, mark it as having no
security at that level and let the user access it only if its
document-level policies allow it.

As for the system not indexing share tokens, I wonder what the cause
could be. Maybe an error in my ManifoldCF or Solr configuration strips
all the share tokens, or perhaps we need to configure share-level
security on the machine that holds the documents we are indexing, as
we did at the document level.

-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com]
Sent: miércoles, 07 de noviembre de 2012 11:42
To: user@manifoldcf.apache.org
Subject: Re: Problem with manifold

So, can you look at one document, and tell me what the allow and deny
tokens are for both document and share levels?

Just taking the share part of the clause away means that you will be
allowing people to see search results for documents they cannot reach
through the corresponding Windows share (according to Active
Directory).  I'm hoping that you are just crawling through a different
share than the one your users use to access the documents.  But in any
case, the URLs that are indexed will also not work to reach the files
in question because of the share restrictions.

Karl

On Wed, Nov 7, 2012 at 4:20 AM, Gonzalez, Pablo
pablo.gonzalez.do...@hp.com wrote:
 Hello Karl, this is what I've done:
 -I've modified the class so that it prints out the BooleanQuery that it 
 creates.
 -I've rerun the query (with my handler), and this is what it pumps out:

 +((+allow_token_share:__nosecurity__ +deny_token_share:__nosecurity__)
  allow_token_share:active_dir:S-1-5-32-545
  -deny_token_share:active_dir:S-1-5-32-545
  allow_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-
  -deny_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-
  allow_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-513
  -deny_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-513
  allow_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1113
  -deny_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1113
  allow_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
  -deny_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
  allow_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1107
  -deny_token_share:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1107
  allow_token_share:active_dir:S-1-1-0
  -deny_token_share:active_dir:S-1-1-0
  allow_token_share:ad:S-1-5-32-545
  -deny_token_share:ad:S-1-5-32-545
  allow_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-
  -deny_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-
  allow_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-513
  -deny_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-513
  allow_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-1113
  -deny_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-1113
  allow_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-1110
  -deny_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-1110
  allow_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-1107
  -deny_token_share:ad:S-1-5-21-2039231098-2614715072-2050932820-1107
  allow_token_share:ad:S-1-1-0
  -deny_token_share:ad:S-1-1-0)
 +((+allow_token_document:__nosecurity__ +deny_token_document:__nosecurity__)
  allow_token_document:active_dir:S-1-5-32-545

Re: Problem with manifold

2012-11-05 Thread Karl Wright
Just reran the tests on the trunk version of the ManifoldCF solr 3.x
plugin - looked good:

[junit] Testsuite: org.apache.solr.mcf.ManifoldCFQParserPluginTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 10.56 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
[junit] RESOURCE LEAK: test class left 1 thread(s) running
[junit] ------------- ---------------- ---------------
[junit] Testsuite: org.apache.solr.mcf.ManifoldCFSearchComponentTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 2.096 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
[junit] RESOURCE LEAK: test class left 1 thread(s) running
[junit] ------------- ---------------- ---------------
[junit] Testsuite: org.apache.solr.mcf.ManifoldCFSCLoadTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 40.486 sec
[junit]
[junit] ------------- Standard Output ---------------
[junit] Query time = 24352
[junit] ------------- ---------------- ---------------
[junit] ------------- Standard Error -----------------
[junit] WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
[junit] RESOURCE LEAK: test class left 1 thread(s) running
[junit] ------------- ---------------- ---------------


The components that this test uses are simple:

<?xml version="1.0" ?>

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<!-- $Id: solrconfig-auth.xml 1176500 2011-09-27 18:19:59Z kwright $
 $Source$
 $Name$
  -->

<config>

  <luceneMatchVersion>${tests.luceneMatchVersion:LUCENE_CURRENT}</luceneMatchVersion>
  <jmx />

  <dataDir>${solr.data.dir:}</dataDir>

  <directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.RAMDirectoryFactory}"/>

  <updateHandler class="solr.DirectUpdateHandler2">
  </updateHandler>

  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />

  <!-- test MCF Security Filter settings -->
  <searchComponent name="mcf-param"
                   class="org.apache.solr.mcf.ManifoldCFSearchComponent" >
    <str name="AuthorityServiceBaseURL">http://localhost:8345/mcf-as</str>
    <int name="SocketTimeOut">3000</int>
    <str name="AllowAttributePrefix">aap-</str>
    <str name="DenyAttributePrefix">dap-</str>
  </searchComponent>

  <searchComponent name="mcf"
                   class="org.apache.solr.mcf.ManifoldCFSearchComponent" >
  </searchComponent>

  <requestHandler name="/mcf" class="solr.SearchHandler" startup="lazy">
    <lst name="invariants">
      <bool name="mcf">true</bool>
    </lst>
    <lst name="defaults">
      <str name="echoParams">all</str>
    </lst>
    <arr name="components">
      <str>query</str>
      <str>mcf</str>
    </arr>
  </requestHandler>

</config>



On Mon, Nov 5, 2012 at 5:42 AM, Karl Wright daddy...@gmail.com wrote:
 No - I mean modifying ManifoldCFSearchComponent itself, and rebuilding
 the component yourself.  You can download the sources that correspond
 to the release from the ManifoldCF download page,
 http://manifoldcf.apache.org/en_US/download.html .

 Karl

 On Mon, Nov 5, 2012 at 4:13 AM, Gonzalez, Pablo
 pablo.gonzalez.do...@hp.com wrote:
 Hello,

 By 'modifying the component itself' do you mean to write a subclass of 
 ManifoldCFSearchComponent?

 -Original Message-
 From: Karl Wright [mailto:daddy...@gmail.com]
 Sent: viernes, 02 de noviembre de 2012 14:47
 To: user@manifoldcf.apache.org
 Subject: Re: Problem with manifold

 If you don't get anywhere with the debug component, you can try modifying 
 the component itself to print the incoming query and the modified query.  
 You might also want to look at the ManifoldCF component tests, which create 
 a handler internally and ran successfully when the component was 
 released.  If you create a similar handler and that works, then you can try 
 to figure out what the differences are.

 Thanks,
 Karl

 On Fri, Nov 2, 2012 at 8:29 AM, Gonzalez, Pablo 
 pablo.gonzalez.do...@hp.com wrote:
 Well, it went wrong. I will crawl again just in case, and if that doesn't go 
 well, I will search the Internet for that debug component

RE: Problem with manifold

2012-11-02 Thread Gonzalez, Pablo
Hello, Mr Wright, and thank you for such a fast response. I am trying to make 
MCF and Solr communicate via a SearchComponent. For this I added the 
apache-solr-mcf-3.6-SNAPSHOT.jar that comes in the solr-integration file to 
the lib folder of the deployed Solr webapp in Tomcat. Then I changed 
solrconfig.xml, adding this piece of code:


<!-- LCF document security enforcement component -->
<searchComponent name="mcfSecurity"
                 class="org.apache.solr.mcf.ManifoldCFSearchComponent">
  <str name="AuthorityServiceBaseURL">http://localhost:8345/mcf</str>
</searchComponent>


<requestHandler name="/search" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters can be specified, these
       will be overridden by parameters in the request
    -->
  <!--  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
  </lst>-->
  <arr name="last-components">
    <str>mcfSecurity</str>
  </arr>
  <!--a bunch of comments-->
</requestHandler>

Last thing, I didn't write any additional Java code. I thought it wasn't 
necessary.

Thanks,

Pablo


-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com] 
Sent: viernes, 02 de noviembre de 2012 10:21
To: user@manifoldcf.apache.org
Subject: Re: Problem with manifold

The ManifoldCF Solr plugin operates by requesting access tokens from ManifoldCF 
(which seems to be working fine), and using those to modify the incoming Solr 
search expression to limit the results according to those access tokens.

There are two ways (and two independent classes) you can configure to perform 
this modification.  One of these classes functions as a query parser plugin.  
The other functions as a search component.  Obviously, for either one to work 
right, the Solr configuration has to work properly too.  Can you provide 
details as to (a) which one you are using, and (b) what the configuration 
details are, e.g. the appropriate clauses from solrconfig.xml?
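
To make the modification described above concrete, here is a minimal sketch of how a list of access tokens could be turned into the text form of the security clause. It only mirrors the printed BooleanQuery reported elsewhere in this thread. The real plugin is Java and builds Lucene query objects, and `level_clause` and `security_filter` are hypothetical names used here for illustration.

```python
NOSECURITY = "__nosecurity__"


def level_clause(allow_field, deny_field, tokens):
    """Render one security level as a required boolean clause."""
    # A document with no restrictions at this level carries the default value
    # in both the allow and the deny field.
    parts = [f"(+{allow_field}:{NOSECURITY} +{deny_field}:{NOSECURITY})"]
    for token in tokens:
        parts.append(f"{allow_field}:{token}")   # optional: user holds an allow token
        parts.append(f"-{deny_field}:{token}")   # prohibited: user holds a deny token
    return "+(" + " ".join(parts) + ")"


def security_filter(tokens):
    """Combine the share-level and document-level clauses; both are required."""
    return " ".join([
        level_clause("allow_token_share", "deny_token_share", tokens),
        level_clause("allow_token_document", "deny_token_document", tokens),
    ])


print(security_filter(["ad:S-1-1-0"]))
```

The output is a display form only. Because the tokens themselves contain colons, they would need quoting or escaping before being pasted into a query box.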

Thanks,
Karl

On Fri, Nov 2, 2012 at 4:57 AM, Gonzalez, Pablo pablo.gonzalez.do...@hp.com 
wrote:
 Hello,
 I don't know if you already got this message, but here it is anyway:
 I have been trying to connect ManifoldCF to Solr. I have a file system 
 on a remote server, protected by Active Directory.
 I have configured a ManifoldCF job to import only part of the 
 documents under the file system. In fact, I import from a file 
 which contains only 2 documents, to make it easier to see what is 
 happening and draw conclusions. Afterwards the documents are output 
 to the Solr server.
 I have created a request handler called selectManifold to connect
 ManifoldCF and Solr. Then I call it via
 http://[host]:8080/solr/selectManifold?indent=on&version=2.2&q=*%3A*&fq=&start=0&rows=10&fl=*%2Cscore&wt=&explainOther=&hl.fl=&AuthenticatedUserName=user@domain
 When I do this, Tomcat's log (catalina.out) shows:
 oct 31, 2012 2:40:33 PM org.apache.solr.mcf.ManifoldCFSearchComponent prepare
 INFO: Trying to match docs for user 'user@domain'
 oct 31, 2012 2:40:33 PM org.apache.solr.mcf.ManifoldCFSearchComponent getAccessTokens
 INFO: For user 'user@domain', saw authority response AUTHORIZED:Auth+active+directory+para+el+file+system
 (this one is the Active Directory authority I'm currently using for the job)
 oct 31, 2012 2:40:33 PM org.apache.solr.mcf.ManifoldCFSearchComponent getAccessTokens
 INFO: For user 'user@domain', saw authority response AUTHORIZED:ad
 (this one isn't)
 oct 31, 2012 2:40:33 PM org.apache.solr.core.SolrCore execute
 INFO: [] webapp=/solr path=/selectManifold params={explainOther=&fl=*,score&indent=on&start=0&q=*:*&hl.fl=&wt=&fq=&version=2.2&rows=10&AuthenticatedUserName=user@domain} hits=0 status=0 QTime=183
 So it does connect and get my user's tokens. In fact, if I 
 go to http://[host]/mcf/UserACLs?username=user@domain, this is the 
 result:
 AUTHORIZED:Auth+active+directory+para+el+file+system
 TOKEN:active_dir:S-1-5-32-545
 TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-
 TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-513
 TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1113
 TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
 TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1107
 TOKEN:active_dir:S-1-1-0
 AUTHORIZED:ad
 TOKEN:ad:S-1-5-32-545
 TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-
 TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-513
 TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1113
 TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1110
 TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1107
 TOKEN:ad:S-1-1-0
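
A response like the one above is easy to check programmatically. The following is a minimal sketch that groups the tokens by authority. It assumes, based only on the layout shown here, that each `TOKEN:` line belongs to the most recent `AUTHORIZED:` line. `parse_user_acls` is a hypothetical helper, not part of ManifoldCF, and the sample is a shortened copy of the response above.

```python
def parse_user_acls(text):
    """Return {authority_name: [token, ...]} from a UserACLs response body."""
    authorities = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("AUTHORIZED:"):
            current = line[len("AUTHORIZED:"):]
            authorities[current] = []
        elif line.startswith("TOKEN:") and current is not None:
            # Assumption: tokens follow the authority line they belong to.
            authorities[current].append(line[len("TOKEN:"):])
    return authorities


sample = """AUTHORIZED:Auth+active+directory+para+el+file+system
TOKEN:active_dir:S-1-5-32-545
TOKEN:active_dir:S-1-1-0
AUTHORIZED:ad
TOKEN:ad:S-1-5-32-545
TOKEN:ad:S-1-1-0
"""

acls = parse_user_acls(sample)
print(acls["ad"])  # ['ad:S-1-5-32-545', 'ad:S-1-1-0']
```

Comparing these token strings with the values indexed in the allow_token_* and deny_token_* fields shows whether the user's tokens and the documents' tokens can match at all.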
 Moreover, if I go to http://[host]:8080/solr/admin/schema.jsp and 
 search for the allow_token_document field, it says that
 active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
 (which appeared in the list of UserACLs) has frequency 2 (remember I 
 only have 2 documents indexed

Re: Problem with manifold

2012-11-02 Thread Karl Wright
Actually, from your log it is clear that ManifoldCF can be reached
fine from your Solr instance, so please disregard that question.

The only other potential issue has to do with Solr search component
ordering.  This is a bit of black magic, because other Solr components
may modify the request in ways which are potentially incompatible with
the ManifoldCF plugin.  So if you are sure your fields are all
correct, you might want to play around with the ordering of your
components to see if that makes any difference.

There used to be a debug component you could also use which would print
out the (full) query and the results returned - that may also be
useful.
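
One way to try the debug route is to add Solr's standard `debugQuery` parameter to the request. The sketch below only builds such a URL. The base URL is a placeholder, the handler name and user are the ones mentioned in this thread, and `solr_debug_url` is a hypothetical helper. Whether the clauses added by the ManifoldCF component show up in the debug output is not confirmed here.

```python
from urllib.parse import urlencode


def solr_debug_url(base, handler, user, query="*:*"):
    """Build a Solr request URL with debug output switched on."""
    params = [
        ("q", query),
        ("rows", "10"),
        ("fl", "*,score"),
        ("AuthenticatedUserName", user),  # parameter read by the MCF component
        ("debugQuery", "true"),           # standard Solr debug switch
    ]
    # urlencode percent-encodes each value and joins the pairs with '&'.
    return f"{base}/{handler}?{urlencode(params)}"


# Placeholder host; replace with the real Solr location.
print(solr_debug_url("http://localhost:8080/solr", "selectManifold", "user@domain"))
```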

Thanks,
Karl

On Fri, Nov 2, 2012 at 6:25 AM, Karl Wright daddy...@gmail.com wrote:
 Hi Pablo,

 The first thing that I notice is that, as you have this configured,
 you need four fields declared in your schema as indexable fields:

 allow_token_document
 deny_token_document
 allow_token_share
 deny_token_share


 Do you have these fields declared, and did you have them all declared
 when you performed the crawl?

 Second, the way it is configured, the machine that is running Solr
 must be the same as the machine running ManifoldCF (because you used a
 localhost url).  Is this true?

 Thanks,
 Karl



RE: Problem with manifold

2012-11-02 Thread Gonzalez, Pablo
Ok, I already had the fields in my schema.xml. This is the piece of code 
regarding them:

   <field name="allow_token_document" type="string" indexed="true" 
          stored="false" multiValued="true"/>

   <field name="deny_token_document" type="string" indexed="true" 
          stored="false" multiValued="true"/>

   <field name="allow_token_share" type="string" indexed="true" 
          stored="false" multiValued="true"/>

   <field name="deny_token_share" type="string" indexed="true" 
          stored="false" multiValued="true"/>

So, just to be clear, what you are suggesting is to cut the piece of code 
that contains my request handler, paste it in another part of the 
solrconfig.xml file, and try this a number of times. I will do so, and 
I'll tell you whether it worked.
