Re: Specifications of HopFilters "Keep unreachable documents"

2019-11-08 Thread Issei Nishigata

Hi Karl,


Thank you for a quick response.

It seems that I have completely misunderstood the specification, so it would
be helpful if you could show specific examples for each hop count mode.

Is my understanding below correct?
- "Keep unreachable documents, for now" and "... forever" are settings that
do not delete documents that were not crawled from the index.
- Hop count dependency information is like a cache of the link structure.
This link structure is not recreated in "Keep unreachable documents,
forever" mode, so crawling is faster.


I am asking these questions because a document was deleted that I expected
to be kept.
Is there any way to prevent it from being deleted? What exactly does "keep"
refer to in "Keep unreachable documents"?


Sincerely,
Issei Nishigata



On 2019/11/08 2:19, Karl Wright wrote:

Hi Issei,

The "Keep unreachable documents, forever" setting basically means that no
hop count dependency information is kept around for any crawls done while
that setting is in place.  That means that when links or documents change,
the system does not know how to recompute the hop count accurately.  This
setting is appropriate if you want your crawl to be as fast as possible and
do not expect ever to use hop count filtering for the job in question.


The "Keep unreachable documents, for now" setting means that enough
information is kept around that if you decided to put a hop count filter
into place later, it would still work properly.


Hope that helps.

Karl
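
To make the role of hop count dependency information more concrete, here is
a minimal sketch. It is not ManifoldCF's implementation; it only assumes a
toy link graph held in a dict and uses a breadth-first search to compute how
many link hops each document is from the seed. The names hop_counts, links,
seeds and max_hops are hypothetical. The point it illustrates is that when a
link disappears, the hop counts can only be recomputed if the remaining link
structure is still available.

```python
from collections import deque

def hop_counts(links, seeds):
    """Breadth-first search: fewest link hops from any seed to each document."""
    counts = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        doc = queue.popleft()
        for target in links.get(doc, ()):
            if target not in counts:  # first visit in BFS is the shortest path
                counts[target] = counts[doc] + 1
                queue.append(target)
    return counts

# Hypothetical link graph: document -> documents it links to.
links = {"root": ["a", "b"], "a": ["c"], "b": ["d"], "d": ["c"]}
max_hops = 2  # hypothetical hop count filter

before = hop_counts(links, ["root"])
kept_before = sorted(d for d, n in before.items() if n <= max_hops)

# The link a -> c goes away. "c" is still reachable, but only through
# b -> d -> c, so its hop count grows and it falls outside the filter.
links["a"] = []
after = hop_counts(links, ["root"])
kept_after = sorted(d for d, n in after.items() if n <= max_hops)

print("before:", before, "kept:", kept_before)
print("after: ", after, "kept:", kept_after)
```

Without the stored links there would be no way to tell that "c" moved from
two hops to three after the change, which is the kind of recomputation the
explanation above refers to.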


On Thu, Nov 7, 2019 at 11:01 AM Issei Nishigata <duo.2...@gmail.com> wrote:

Hi All,


I use MCF 2.12, and I am confused about the specification of the HopFilters
setting "Keep unreachable documents".

My understanding is that "Keep unreachable documents, for now" and "Keep
unreachable documents, forever" in HopFilters are settings that take effect
when a HopCount is specified.

For example, suppose I crawl all data the first time with an empty HopCount
value, and the second time I set HopCount to 0 with "Keep unreachable
documents, for now". I expected that only the first layer of the directory
would be crawled, and that the second and deeper layers, which are not
crawled, would not be deleted from the index.

However, when I actually run with the settings above, documents on the
second layer are deleted from the index on the second and subsequent runs.
It works the same way with "Keep unreachable documents, forever".

Is there anything wrong with my understanding? And does anyone know the
difference between these two settings, "Keep unreachable documents, for now"
and "Keep unreachable documents, forever"?

If anyone knows the specification of these settings, it would be very
helpful if you could share your advice.
Any clue would be much appreciated.


Sincerely,
Issei Nishigata



Re: Windows shares connector-Error

2019-11-08 Thread Karl Wright
(1) Download the source distribution and the lib distribution.
(2) Unpack both and follow the directions for putting the lib folder in
place.
(3) Run "ant make-deps" to download the correct version of jcifs.
(4) Run "ant build" to make a distribution that includes the proprietary
examples.
(5) Use the proprietary example you need.

This might be a good idea because we no longer use the older versions of
jcifs, but a newer one with some fixes instead.

Karl




Re: Windows shares connector-Error

2019-11-08 Thread Priya Arora
This didn't work either. Could this (with ManifoldCF version 2.14) also have
something to do with the Java version? If so, my JAVA_HOME points to Java
version 8. Can you suggest something?



Re: Windows shares connector-Error

2019-11-08 Thread Sreejith Variyath
Place the jcifs.jar into the *connector-lib-proprietary* directory.



-- 
Best Regards,


*Sreejith Variyath*
Lead Software Engineer
Tarams Software Technologies Pvt. Ltd.
Venus Buildings, 2nd Floor 1/2,3rd Main,
Kalyanamantapa Road Jakasandra, 1st Block Kormangala
Bangalore - 560034
Tarams 

-- 
www.tarams.com     
=

DISCLAIMER: The information in this message is confidential and may be
legally privileged. It is intended solely for the addressee. Access to this
message by anyone else is unauthorized. If you are not the intended
recipient, any disclosure, copying, or distribution of the message, or any
action or omission taken by you in reliance on it, is prohibited and may be
unlawful. Please immediately contact the sender if you have received this
message in error. Further, this e-mail may contain viruses and all
reasonable precaution to minimize the risk arising there from is taken by
Tarams. Tarams is not liable for any damage sustained by you as a result of
any virus in this e-mail. All applicable virus checks should be carried out
by you before opening this e-mail or any attachment thereto.
Thank you - Tarams Software Technologies Pvt.Ltd.
=


Windows shares connector-Error

2019-11-08 Thread Priya Arora
Hi All

I installed version 2.14 of ManifoldCF, then uncommented the line in the
connectors.xml file for
org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector,
but when I try to start it (java -jar start.jar) I get an error.

I also checked that mcf-jcifs-connector.jar is present in connector-lib.

Do I need to do anything else? Here is the error log.

Successfully registered repository connector 'org.apache.manifoldcf.crawler.connectors.jdbc.JDBCConnector'
Exception in thread "main" java.lang.NoClassDefFoundError: jcifs/smb/SmbException
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Unknown Source)
    at org.apache.manifoldcf.core.system.ManifoldCFResourceLoader.findClass(ManifoldCFResourceLoader.java:149)
    at org.apache.manifoldcf.core.system.ManifoldCF.findClass(ManifoldCF.java:1533)
    at org.apache.manifoldcf.core.interfaces.ConnectorFactory.getThisConnectorRaw(ConnectorFactory.java:144)
    at org.apache.manifoldcf.core.interfaces.ConnectorFactory.getThisConnectorNoCheck(ConnectorFactory.java:118)
    at org.apache.manifoldcf.core.interfaces.ConnectorFactory.installThis(ConnectorFactory.java:48)
    at org.apache.manifoldcf.crawler.interfaces.RepositoryConnectorFactory.install(RepositoryConnectorFactory.java:100)
    at org.apache.manifoldcf.crawler.connmgr.ConnectorManager.registerConnector(ConnectorManager.java:180)
    at org.apache.manifoldcf.crawler.system.ManifoldCF.registerConnectors(ManifoldCF.java:672)
    at org.apache.manifoldcf.crawler.system.ManifoldCF.reregisterAllConnectors(ManifoldCF.java:160)
    at org.apache.manifoldcf.jettyrunner.ManifoldCFJettyRunner.main(ManifoldCFJettyRunner.java:239)
Caused by: java.lang.ClassNotFoundException: jcifs.smb.SmbException
    at java.base/java.net.URLClassLoader.findClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.net.FactoryURLClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    ... 12 more

Thanks and regards
Priya
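
The ClassNotFoundException above means that no jar visible to the connector
class loader provides jcifs.smb.SmbException. One way to check this is to
scan the jar directories for that class entry. The following is a minimal
sketch, not a ManifoldCF tool: it relies only on the fact that jar files are
zip archives, the function name jars_containing is hypothetical, and it
assumes it is run from the directory that holds the connector-lib and
connector-lib-proprietary folders named in this thread.

```python
import zipfile
from pathlib import Path

def jars_containing(class_name, lib_dir):
    """Return the names of jars in lib_dir that contain the given class."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    found.append(jar.name)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid archives
    return found

if __name__ == "__main__":
    # Directory names taken from this thread; adjust to the actual layout.
    for lib_dir in ("connector-lib", "connector-lib-proprietary"):
        hits = jars_containing("jcifs.smb.SmbException", lib_dir)
        print(lib_dir, "->", hits or "class not found")
```

If neither directory reports a jar, the jcifs library itself is missing,
even though mcf-jcifs-connector.jar is present.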