[jira] [Resolved] (CONNECTORS-87) Connector Framework load test needs to be written

2011-09-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-87.
---

Resolution: Fixed

r1172580


> Connector Framework load test needs to be written
> -
>
> Key: CONNECTORS-87
> URL: https://issues.apache.org/jira/browse/CONNECTORS-87
> Project: ManifoldCF
>  Issue Type: Test
>  Components: Tests
> Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 0.4
>
>
> LCF needs a load or performance test, which verifies that the core software 
> is performing as expected.  This test can use the file system connector, but 
> must verify that individual throttle bins are getting approximately equal 
> time, and that the system as a whole is behaving efficiently.  Furthermore, 
> at least 1,000,000 documents should be crawled by this test.
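
As a rough sketch of the "approximately equal time" check described above, a test could compare the time spent in each throttle bin against the mean. The function name, the input shape (a mapping from bin name to seconds), and the 20% tolerance are all assumptions made for illustration; the issue does not specify them, and this is not the test committed in r1172580.

```python
def bins_are_balanced(time_per_bin, tolerance=0.2):
    """Return True if every bin's time is within `tolerance` of the mean.

    time_per_bin: hypothetical mapping of throttle bin name -> seconds spent.
    tolerance: assumed allowed relative deviation from the mean (0.2 = 20%).
    """
    if not time_per_bin:
        raise ValueError("at least one throttle bin is required")
    times = list(time_per_bin.values())
    mean = sum(times) / len(times)
    if mean == 0:
        # No bin did any work, so they are trivially equal.
        return True
    return all(abs(t - mean) <= tolerance * mean for t in times)


if __name__ == "__main__":
    measured = {"bin-a": 100.0, "bin-b": 110.0, "bin-c": 95.0}
    print(bins_are_balanced(measured))
```

A relative tolerance is used here so the same check applies regardless of how long the crawl runs; an absolute threshold would be an equally plausible choice.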

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CONNECTORS-256) Connector for crawling Wikis

2011-09-19 Thread Karl Wright (JIRA)
Connector for crawling Wikis


 Key: CONNECTORS-256
 URL: https://issues.apache.org/jira/browse/CONNECTORS-256
 Project: ManifoldCF
  Issue Type: New Feature
Affects Versions: ManifoldCF 0.4
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 0.4


People have been trying to crawl wikis with ManifoldCF, but using the generic 
crawler is not a good way to do this.  Instead, it looks like we really could 
use a wiki connector, which would understand the wiki API and thus crawl wiki 
content quickly and effectively.

Some pertinent API references follow:

I don't know if it is possible to link to a wiki document with just the pageid, 
but it is possible to get the URL for the referring pageid via the API:
http://en.wikipedia.org/w/api.php?action=query&prop=info&pageids=27697087&inprop=url

It is possible to get the metadata of a document using the page ID (instead of 
the title) directly:
Title -> 
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API&rvprop=timestamp|user|comment|content
PageID -> 
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&pageids=27697087&rvprop=timestamp|user|comment|content



- There needs to be some notion of an overall list of pages:
   - http://www.mediawiki.org/wiki/API:Allpages
   - Example: 
http://en.wikipedia.org/w/api.php?action=query&list=allpages&apfrom=Kre&aplimit=5

- Metadata information (author and pub date) also needs to be separated out in 
some way:
   - http://www.mediawiki.org/wiki/API:Properties#Revisions:_Example
   - Example:  
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API|Main%20Page&rvprop=timestamp|user|comment|content
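
To make the API references above concrete, here is a minimal Python sketch that builds the same query URLs from their parameters. The helper names are hypothetical and are not part of ManifoldCF or the eventual connector; only the endpoint and parameter names come from the example URLs above. No HTTP request is sent.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URLs in this issue.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_query_url(params, endpoint=API_ENDPOINT):
    # safe="|" keeps the pipe separators readable, as in the examples above.
    return endpoint + "?" + urlencode(params, safe="|")


def page_url_lookup(page_id):
    """URL of the query that returns the full page URL for a pageid."""
    return build_query_url(
        {"action": "query", "prop": "info", "pageids": page_id, "inprop": "url"}
    )


def revisions_by_page_id(page_id):
    """URL of the query that returns revision metadata and content for a pageid."""
    return build_query_url(
        {
            "action": "query",
            "prop": "revisions",
            "pageids": page_id,
            "rvprop": "timestamp|user|comment|content",
        }
    )


def all_pages(start_from, limit):
    """URL of the query that lists pages starting at a given title."""
    return build_query_url(
        {"action": "query", "list": "allpages", "apfrom": start_from, "aplimit": limit}
    )


if __name__ == "__main__":
    print(page_url_lookup(27697087))
    print(revisions_by_page_id(27697087))
    print(all_pages("Kre", 5))
```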







[jira] [Updated] (CONNECTORS-256) Connector for crawling Wikis

2011-09-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-256:
---

Component/s: Wiki connector

> Connector for crawling Wikis
> 
>
> Key: CONNECTORS-256
> URL: https://issues.apache.org/jira/browse/CONNECTORS-256
> Project: ManifoldCF
>  Issue Type: New Feature
>  Components: Wiki connector
> Affects Versions: ManifoldCF 0.4
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 0.4
>
>
> People have been trying to crawl wikis with ManifoldCF, but using the generic 
> crawler is not a good way to do this.  Instead, it looks like we really could 
> use a wiki connector, which would understand the wiki API and thus crawl wiki 
> content quickly and effectively.
> Some pertinent API references follow:
> I don't know if it is possible to link to a wiki document with just the 
> pageid, but it is possible to get the URL for the referring pageid via the API:
> http://en.wikipedia.org/w/api.php?action=query&prop=info&pageids=27697087&inprop=url
> It is possible to get the metadata of a document using the page ID (instead 
> of the title) directly:
> Title -> 
> http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API&rvprop=timestamp|user|comment|content
> PageID -> 
> http://en.wikipedia.org/w/api.php?action=query&prop=revisions&pageids=27697087&rvprop=timestamp|user|comment|content
> - There needs to be some notion of an overall list of pages:
>    - http://www.mediawiki.org/wiki/API:Allpages
>    - Example: 
> http://en.wikipedia.org/w/api.php?action=query&list=allpages&apfrom=Kre&aplimit=5
> - Metadata information (author and pub date) also needs to be separated out 
> in some way:
>    - http://www.mediawiki.org/wiki/API:Properties#Revisions:_Example
>    - Example:  
> http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API|Main%20Page&rvprop=timestamp|user|comment|content





1.0 release, and graduation

2011-09-19 Thread Karl Wright
Folks,
I'd like to begin discussion about the next release, currently labeled
0.4, and also our potential for graduation from the incubator.  What
I'd like is a sense of:

(a) what we are still missing as far as incubator graduation is concerned, and
(b) what a 1.0 release might look like to everyone

Please try to be as concrete as possible.  My own personal goal is to
see this happen by the end of the year, more or less.  To that end
I've already begun triaging JIRA tickets for the 0.4 release that I
think would be appropriate for a 1.0 release.

It's entirely possible that some things that people feel strongly
about may not be doable in that time frame, but so be it.  This may
also be true of our status as a project.

Thanks,
Karl


[jira] [Updated] (CONNECTORS-257) Input-able response lifetime for Active Directory authority

2011-09-19 Thread Shinichiro Abe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichiro Abe updated CONNECTORS-257:
--

Attachment: CONNECTORS-257-1.patch

> Input-able response lifetime for Active Directory authority
> ---
>
> Key: CONNECTORS-257
> URL: https://issues.apache.org/jira/browse/CONNECTORS-257
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Active Directory authority
> Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
> Reporter: Shinichiro Abe
> Assignee: Shinichiro Abe
> Priority: Minor
> Fix For: ManifoldCF 0.4
>
> Attachments: CONNECTORS-257-1.patch
>
>
> The access tokens are cached for one minute, and up to 1000 different
> users' access tokens will be cached at any one time.
> Each username's cached entry expires after remaining idle.
> Its expiration time depends on the response lifetime, and the
> expiration time is updated each time the cache is looked up.
> Currently the response lifetime is 1 minute.
> Since I want to access Active Directory frequently, 
> I have made this response lifetime user-configurable.





[jira] [Created] (CONNECTORS-257) Input-able response lifetime for Active Directory authority

2011-09-19 Thread Shinichiro Abe (JIRA)
Input-able response lifetime for Active Directory authority
---

 Key: CONNECTORS-257
 URL: https://issues.apache.org/jira/browse/CONNECTORS-257
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Active Directory authority
Affects Versions: ManifoldCF 0.2, ManifoldCF 0.1, ManifoldCF 0.3
Reporter: Shinichiro Abe
Assignee: Shinichiro Abe
Priority: Minor
 Fix For: ManifoldCF 0.4
 Attachments: CONNECTORS-257-1.patch

The access tokens are cached for one minute, and up to 1000 different
users' access tokens will be cached at any one time.

Each username's cached entry expires after remaining idle.
Its expiration time depends on the response lifetime, and the
expiration time is updated each time the cache is looked up.
Currently the response lifetime is 1 minute.

Since I want to access Active Directory frequently, 
I have made this response lifetime user-configurable.
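
The caching behaviour described above can be sketched roughly as follows. This is an illustrative Python model, not the code in CONNECTORS-257-1.patch: the class name, method names, and the injectable clock are hypothetical, and evicting the least recently used entry once the 1000-entry limit is reached is an assumption, since the description does not say which entry is dropped.

```python
import time
from collections import OrderedDict


class IdleExpiringCache:
    """Per-username token cache whose entries expire after an idle period."""

    def __init__(self, lifetime_seconds=60.0, max_entries=1000, clock=time.monotonic):
        self._lifetime = lifetime_seconds  # the configurable "response lifetime"
        self._max_entries = max_entries
        self._clock = clock                # injectable so tests can control time
        self._entries = OrderedDict()      # username -> (tokens, expires_at)

    def get(self, username):
        entry = self._entries.get(username)
        if entry is None:
            return None
        tokens, expires_at = entry
        now = self._clock()
        if now >= expires_at:
            # Idle for longer than the lifetime: drop the entry.
            del self._entries[username]
            return None
        # A successful lookup pushes the expiration time forward.
        self._entries[username] = (tokens, now + self._lifetime)
        self._entries.move_to_end(username)
        return tokens

    def put(self, username, tokens):
        self._entries[username] = (tokens, self._clock() + self._lifetime)
        self._entries.move_to_end(username)
        # Assumed policy: evict the least recently used entries beyond the limit.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
```

With a longer `lifetime_seconds`, repeated lookups for the same user are served from the cache for longer, which reduces how often the directory has to be queried.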







[jira] [Updated] (CONNECTORS-257) Input-able response lifetime for Active Directory authority

2011-09-19 Thread Shinichiro Abe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichiro Abe updated CONNECTORS-257:
--

Description: 
The access tokens are cached for one minute, and up to 1000 different
users' access tokens will be cached at any one time.

Each username's cached entry expires after remaining idle.
Its expiration time depends on the response lifetime, and the
expiration time is updated each time the cache is looked up.
Currently the response lifetime is 1 minute.

Since I do not want to access Active Directory frequently, 
I have made this response lifetime user-configurable.



  was:
The access tokens are cached for one minute, and up to 1000 different
users' access tokens will be cached at any one time.

Each username's cached entry expires after remaining idle.
Its expiration time depends on the response lifetime, and the
expiration time is updated each time the cache is looked up.
Currently the response lifetime is 1 minute.

Since I want to access Active Directory frequently, 
I have made this response lifetime user-configurable.




> Input-able response lifetime for Active Directory authority
> ---
>
> Key: CONNECTORS-257
> URL: https://issues.apache.org/jira/browse/CONNECTORS-257
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Active Directory authority
> Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
> Reporter: Shinichiro Abe
> Assignee: Shinichiro Abe
> Priority: Minor
> Fix For: ManifoldCF 0.4
>
> Attachments: CONNECTORS-257-1.patch
>
>
> The access tokens are cached for one minute, and up to 1000 different
> users' access tokens will be cached at any one time.
> Each username's cached entry expires after remaining idle.
> Its expiration time depends on the response lifetime, and the
> expiration time is updated each time the cache is looked up.
> Currently the response lifetime is 1 minute.
> Since I do not want to access Active Directory frequently, 
> I have made this response lifetime user-configurable.
