Re: [VOTE] Release Apache ManifoldCF 0.1 Incubating, RC7

2011-01-10 Thread Jack Krupansky

+1 (but non-committer)

-- Jack Krupansky

-Original Message- 
From: Karl Wright 
Sent: Monday, January 10, 2011 8:40 AM 
To: connectors-dev@incubator.apache.org 
Subject: Re: [VOTE] Release Apache ManifoldCF 0.1 Incubating, RC7 


+1 for me.
Karl

On Mon, Jan 10, 2011 at 8:39 AM, Karl Wright daddy...@gmail.com wrote:

Vote for/against RC7 on this thread, please.
Karl



Re: [VOTE] Release Apache ManifoldCF 0.1 Incubating, RC6

2011-01-08 Thread Jack Krupansky

+1 (but a non-committer)

-- Jack Krupansky

-Original Message- 
From: Karl Wright

Sent: Saturday, January 08, 2011 7:10 PM
To: connectors-dev
Subject: Re: [VOTE] Release Apache ManifoldCF 0.1 Incubating, RC6

+1 from me.
Karl

On Sat, Jan 8, 2011 at 7:05 PM, Karl Wright daddy...@gmail.com wrote:
This is a formal vote, on release candidate RC6 for the ManifoldCF 0.1 
release.


I was forced to make changes to the originally voted RC3 at the behest
of the incubator - specifically, the way the release was packaged.
After the incubator got through with cutting stuff out, Grant asked me
to put stuff back.  As a result, there are two archives, a -src
archive and a -bin, which is really binary+sources.

I believe everybody's concerns in the incubator have been dealt with
in the RC6 release candidate.  I have uploaded it to:

http://people.apache.org/~kwright/apache-manifoldcf-0.1-incubating

I've also tagged this release candidate in SVN, at:

http://svn.apache.org/repos/asf/incubator/lcf/tags/release-0.1-incubating-RC6

I wish to hold this vote open for as little time as possible.  We need
at least 3 +1's from committers to once again approach the incubator
with this (hopefully final) candidate.

Thanks very much for your time!
Karl





Re: [RESULT][VOTE] Release ManifoldCF 0.1?

2011-01-04 Thread Jack Krupansky

+1 from me (although I still haven't found time to do testing with Solr).

-- Jack Krupansky

-Original Message- 
From: Simon Willnauer

Sent: Tuesday, January 04, 2011 9:11 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [RESULT][VOTE] Release ManifoldCF 0.1?

I checked out all archives and verified the checksums and signatures. All
looks good.

All tests passed for me after downloading and extracting the sources.

I think we are good to go.

here is my +1 - congratulations karl!

simon

PS: seems like we should have another keysigning party - only two
keys there, and Karl's is only signed by Grant.
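
For anyone repeating the checksum part of this check, here is a minimal
Python sketch. It assumes each artifact has a sidecar .md5 file that
contains the hex digest somewhere in its text; the exact layout of the
.md5 files is not stated in this thread, so the parser simply looks for
the first 32-character hex token. The function names are hypothetical,
and signature (GPG) verification is not covered.

```python
import hashlib
import re


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact_path, checksum_path):
    """Compare an artifact against the digest recorded in its .md5 file.

    Assumption: the checksum file holds one 32-character hex digest,
    possibly surrounded by a file name or other text.
    """
    with open(checksum_path, "r", encoding="ascii") as handle:
        match = re.search(r"\b[0-9a-fA-F]{32}\b", handle.read())
    if match is None:
        raise ValueError(f"no MD5 digest found in {checksum_path}")
    return match.group(0).lower() == md5_of(artifact_path)
```

A call such as checksum_matches("some-release.zip", "some-release.zip.md5")
returns True only when the recorded and computed digests agree.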

On Tue, Jan 4, 2011 at 3:03 AM, Karl Wright daddy...@gmail.com wrote:

+1 from me.
Karl

On Mon, Jan 3, 2011 at 3:20 PM, Grant Ingersoll gsing...@apache.org 
wrote:
Sorry for the long delay.  I think we are in pretty good shape as far as 
the legal bits go.  I also did an ant test and went over the signatures, 
LICENSE.txt, NOTICE.txt and checked out the license headers via RAT.


For a 0.1 release, here's my +1.

For the record, Karl, you can vote and it is binding.  Other PPMC members 
are the committers, so they should check it out and vote too.


-Grant

On Jan 3, 2011, at 6:11 AM, Karl Wright wrote:


Any news on this front?
Karl

On Sun, Dec 26, 2010 at 6:27 PM, Karl Wright daddy...@gmail.com wrote:

I hope we can scrape together two more votes.  Who else is on the PPMC
for ManifoldCF?  That's never been clear to me.

Karl

On Fri, Dec 24, 2010 at 4:04 PM, Grant Ingersoll gsing...@apache.org 
wrote:
Yeah, sorry.  It is the holidays for me.  I hope to look at it on 
Monday.  For the record, we need 3 votes from the PPMC and then I 
think we need to go to the Incubator PMC and vote there, but I will 
read up on it.


On Dec 24, 2010, at 11:17 AM, Jack Krupansky wrote:

It's most likely the holidays. Too much of a mad rush, either to 
leave early or to get real work done to try to leave early.


Sorry I wasn't able to find time to try out the RC. Hopefully... next 
week will be slow and I'll have better luck finding a few minutes of 
quiet time.


-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Friday, December 24, 2010 11:09 AM
To: connectors-dev
Subject: [RESULT][VOTE] Release ManifoldCF 0.1?

There were zero votes in favor, and zero against.

I feel a little uncomfortable voting myself, since I put together the
release.  Therefore, the vote effectively fails.

Karl

On Sun, Dec 19, 2010 at 7:33 PM, Karl Wright daddy...@gmail.com 
wrote:

+1 if you think ManifoldCF 0.1 is ready for release, -1 if not. (If
this vote passes, I believe we will still need to hold a vote in the
incubator general list.)
Thanks,
Karl




--
Grant Ingersoll
http://www.lucidimagination.com






--
Grant Ingersoll
http://www.lucidimagination.com








Re: [RESULT][VOTE] Release ManifoldCF 0.1?

2010-12-24 Thread Jack Krupansky
It's most likely the holidays. Too much of a mad rush, either to leave early 
or to get real work done to try to leave early.


Sorry I wasn't able to find time to try out the RC. Hopefully... next week 
will be slow and I'll have better luck finding a few minutes of quiet time.


-- Jack Krupansky

-Original Message- 
From: Karl Wright

Sent: Friday, December 24, 2010 11:09 AM
To: connectors-dev
Subject: [RESULT][VOTE] Release ManifoldCF 0.1?

There were zero votes in favor, and zero against.

I feel a little uncomfortable voting myself, since I put together the
release.  Therefore, the vote effectively fails.

Karl

On Sun, Dec 19, 2010 at 7:33 PM, Karl Wright daddy...@gmail.com wrote:

+1 if you think ManifoldCF 0.1 is ready for release, -1 if not. (If
this vote passes, I believe we will still need to hold a vote in the
incubator general list.)
Thanks,
Karl





Re: [VOTE] Release ManifoldCF 0.1?

2010-12-19 Thread Jack Krupansky
I'm inclined to go ahead with a +1, but maybe it would be advisable to give 
the RC a couple of days. (And maybe I'll finally get around to trying it out 
as well.) If nobody objects by Wednesday/Thursday - and there are some 
+1's - go ahead and say it's done.


-- Jack Krupansky

-Original Message- 
From: Karl Wright

Sent: Sunday, December 19, 2010 7:33 PM
To: connectors-dev
Subject: [VOTE] Release ManifoldCF 0.1?

+1 if you think ManifoldCF 0.1 is ready for release, -1 if not. (If
this vote passes, I believe we will still need to hold a vote in the
incubator general list.)
Thanks,
Karl 



Re: Release?

2010-12-01 Thread Jack Krupansky

+1

Unfortunately I am maxed out until at least Friday, so there has been no 
chance for me to get to look at it. That said, I won't hold it up. Besides, 
0.1 is mostly testing the process anyway, so we can fix issues in 0.2 as 
well. So, I say go for it, unless somebody really objects.


-- Jack Krupansky

-Original Message- 
From: Karl Wright

Sent: Wednesday, December 01, 2010 11:47 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

Should I just call the vote?  It's been a week...
Karl

On Mon, Nov 29, 2010 at 1:18 PM, Karl Wright daddy...@gmail.com wrote:

Great!
Has anyone else had a chance to look at RC1 yet?  If not, should I
offer gift certificates or something to encourage participation? ;-)

Karl


On Sat, Nov 27, 2010 at 7:52 AM, Grant Ingersoll gsing...@apache.org 
wrote:
I'll take a look, but it won't likely be until Tuesday (extended Turkey 
going on here!)


On Nov 24, 2010, at 8:39 AM, Karl Wright wrote:


Uploaded RC1.
Karl

On Wed, Nov 24, 2010 at 7:04 AM, Karl Wright daddy...@gmail.com wrote:

A problem with the FileNet connector has caused me to build an RC1.
It's uploading now.

Karl

On Tue, Nov 23, 2010 at 1:12 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
That's a great leap forward... RC0 of ManifoldCF 0.1! That's a lot of the
hardest of the work.

I'm busy on some other things right now, but maybe next week I can take a
look.

-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Tuesday, November 23, 2010 1:00 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

While I was looking for a solution, an upload attempt succeeded!

So there is now an RC0 out on people.apache.org/~kwright:

[kwri...@minotaur:~]$ ls -lt manifoldcf-0.1.*
-rw-r--r--  1 kwright  kwright         63 Nov 23 17:57 manifoldcf-0.1.tar.gz.md5
-rw-r--r--  1 kwright  kwright         60 Nov 23 17:57 manifoldcf-0.1.zip.md5
-rw-r--r--  1 kwright  kwright  158734230 Nov 23 17:55 manifoldcf-0.1.zip
-rw-r--r--  1 kwright  kwright  156742315 Nov 23 17:06 manifoldcf-0.1.tar.gz

[kwri...@minotaur:~]$

Please let me know what you think.
Karl


On Tue, Nov 23, 2010 at 11:25 AM, Karl Wright daddy...@gmail.com 
wrote:


The upload has failed repeatedly for me, so I'll clearly have to find
another way.
Karl

On Tue, Nov 23, 2010 at 10:47 AM, Karl Wright daddy...@gmail.com 
wrote:


I'm uploading a release candidate now.  But someone needs to feed the
hamsters turning the wheels or something, because the upload speed to
that machine is 51KB/sec, so it's going to take 3 hours to get the
candidate up there, if my network connection doesn't bounce in the
interim.  Is there any other place available?

Karl

On Fri, Nov 19, 2010 at 8:34 AM, Grant Ingersoll 
gsing...@apache.org

wrote:


On Nov 19, 2010, at 6:18 AM, Karl Wright wrote:


I've created a signing key, and checked in a KEYS file.  Apache
instructions for this are actually decent, so I didn't have to make
much stuff up.  Glad about that.



Yep, sorry, have been in meetings.


Last remaining release issue is getting the release files to a
download mirror.  Maybe I can find some doc for that too.



Next steps would be to generate a candidate release which the rest of us
can download.  Put it up on people.apache.org/~YOURUSERNAME/... and then
send a note to the list saying where to locate it.  Rather than call a vote
right away, just ask us to check it out and try it as there will likely be
issues for the first release.  Once we all feel we have a decent candidate,
we can call a vote, which should be a formality.

See http://apache.org/dev/#releases for more info.





Karl

On Fri, Nov 19, 2010 at 4:13 AM, Karl Wright daddy...@gmail.com
wrote:


The build changes are complete.  I removed the modules level from the
hierarchy because it served no useful purpose and complicated matters.
The outer level build.xml now allows you to build code, docs, and run
tests separately from one another, and gives you help as a default.
"ant image" builds you the deliverable .zip and tar.gz files.  The online
site has been polished so that it now contains complete javadoc, as
does the built and delivered .zip and tar.gz's.  In short, we *could*
actually do a release now, if only we had (and incorporated) the KEYS
file I alluded to earlier, which I do not know how to build or obtain.
I believe this needs to be both generated and registered.  The site
also needs to refer to a download location/list of mirrors before it
could go out the door.

Help? Grant?

Karl

On Wed, Nov 17, 2010 at 9:50 PM, Karl Wright daddy...@gmail.com
wrote:


Hearing nothing, went ahead and made the port of documentation to the
site official.  I also now include the generated site in the release
tar.gz and .zip.
Issues still to address before release:

(1) source tar.gz and zip in outer-level build.xml, which I will try
to address shortly.
(2) vehicle for release downloads, and naming thereof.  In short,
where do I put

Re: Release?

2010-11-15 Thread Jack Krupansky
I didn't mean to imply that the wiki needs to be physically included in the 
release zip/tar, just that snapshotting and versioning of the wiki should be 
done, if feasible, so that a user who is on an older release can still see 
the doc for that release. I am just thinking ahead for future releases. So, 
0.1 does not need this right now.


-- Jack Krupansky

-Original Message- 
From: Grant Ingersoll

Sent: Monday, November 15, 2010 10:23 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?


On Nov 10, 2010, at 1:22 AM, Jack Krupansky wrote:

And the wiki doc is also part of the release. Does this stuff get a 
version/release as well? Presumably we want doc for currently supported 
releases, and the doc can vary between releases. Can we easily snapshot 
the wiki?


You can't put the Wiki in a release, as there is no way to track whether the 
person has permission to donate it.




Will we have nightly builds in place? I think a 0.1 can get released 
without a nightly build, but it would be nice to say that we also have a 
rolling trunk release which is just the latest build off trunk and the 
latest wiki/doc as well. So, some people may want the official 0.1, but 
others may want to run straight from trunk/nightly build.


-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Tuesday, November 09, 2010 1:56 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

Proposal:  Release to consist of two things: tar and zip of a complete
source tree, and tar and zip of the modules/dist area after the build.
The implied way people are to work with this is:

- to use just the distribution, untar or unzip the distribution
zip/tar into a work area, and either use the multiprocess version, or
the quickstart example.
- to add a connector, untar or unzip the source zip/tar into a work
area, and integrate your connector into the build.

Is this acceptable for a 0.1 release?

Karl

On Tue, Nov 9, 2010 at 10:22 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
Oh, I wasn't intending to disparage the RSS or other connectors, just giving
my own priority list of must haves. By all means, the well-supported
connector list should be whatever list you want to feel is appropriate and
exclude only those where we feel that we would not be able to provide
sufficient support and assistance online.

That's great that qBase is offering access.

BTW, I was just thinking that maybe we should try to keep logs of each
connector type in action so that people have a reference to consult when
debugging their own connector-related problems. In other words, what a
successful connection session is supposed to look like. So, have a test and
its reference log.

-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Tuesday, November 09, 2010 9:46 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

If you can claim well supported for the web connector, you certainly
should be able to claim it for the RSS connector.  You could also
reasonably include the JDBC connector because it does not require a
proprietary system to test.

But if your definition is that tests exist for all the well
supported ones, somebody has some work to do.  I'd like to see a plan
on how we get from where we are now to a more comprehensive set of
tests.  I've gotten qBase to agree to let me have access to their Q/A
infrastructure (which used to be MetaCarta's), but that's only going
to be helpful for diagnosing problems and doing development, not for
automated tests that anyone can run.

Karl

On Tue, Nov 9, 2010 at 9:38 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


And one of the issues on the list should be to define the "well-supported"
connectors for 0.5 (or whatever) as opposed to the "code is there and
thought to work, you are on your own for testing/support" connectors.
Longer term, we should get most/all connectors into the well-supported
category, but I wouldn't use that as the bar for even 1.0.

My personal minimum well-supported connector list for a 0.5 would be
file system, web, and SharePoint*.

* Oh... there is the issue of SharePoint 2010 or whatever the latest is,
but current MCF support should be good enough for a 0.5 release, I think.

(Got to keep up with Google Connectors!)

-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Tuesday, November 09, 2010 9:28 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

I'm in favor of a release.  I'm not sure, though, what the release
parameters ought to be.  I think the minimum is that we need to build
a release infrastructure and plan, set up a release process, and
decide what the release packaging should look like (zip's, tar's,
sources, deliverables) and where the javadoc will be published online.
(It's possible that we may, for instance, decide to change the way
the ant build scripts work to make it easier for people to build the
proprietary connectors after the fact

Re: Release?

2010-11-09 Thread Jack Krupansky
At least get a release 0.1 dry-run with code as-is out ASAP to flush out 
release process issues. This would help to send out a message to the rest of 
the world that MCF is an available product rather than purely 
development/incubation.


Then come up with a list of issues that people strongly feel need to be 
resolved before a true, squeaky-clean 1.0 release. Maybe that is the 
original list of tasks, including better testing, but some review/decisions 
are probably needed. That will be the ultimate target.


Then decide on a close enough subset of issues that would constitute what 
people consider a solid beta and target that as a release 0.5 and focus on 
that as the near-term target (after getting 0.1 out ASAP.) I personally do 
not have any major issues on the top of my head that I would hold out as 
blockers for a 0.5.


Or, get 0.1 out and then move on to a 0.2, etc. on a monthly/bi-monthly 
basis as progress is made.


In short, get MCF as-is 0.1 out ASAP, have a very short list for MCF 0.5 to 
get it out reasonably soon, and then revisit what 1.0 really means versus 
0.6, etc.


-- Jack Krupansky

-Original Message- 
From: Grant Ingersoll

Sent: Tuesday, November 09, 2010 8:38 AM
To: connectors-dev@incubator.apache.org
Subject: Release?

Now that we have NTLM figured out and the Memex stuff behind us, how do 
people feel about working towards a release?


-Grant 



Re: Release?

2010-11-09 Thread Jack Krupansky
And one of the issues on the list should be to define the "well-supported" 
connectors for 0.5 (or whatever) as opposed to the "code is there and 
thought to work, you are on your own for testing/support" connectors. Longer 
term, we should get most/all connectors into the well-supported category, 
but I wouldn't use that as the bar for even 1.0.


My personal minimum well-supported connector list for a 0.5 would be file 
system, web, and SharePoint*.


* Oh... there is the issue of SharePoint 2010 or whatever the latest is, but 
current MCF support should be good enough for a 0.5 release, I think.


(Got to keep up with Google Connectors!)

-- Jack Krupansky

-Original Message- 
From: Karl Wright

Sent: Tuesday, November 09, 2010 9:28 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

I'm in favor of a release.  I'm not sure, though, what the release
parameters ought to be.  I think the minimum is that we need to build
a release infrastructure and plan, set up a release process, and
decide what the release packaging should look like (zip's, tar's,
sources, deliverables) and where the javadoc will be published online.
(It's possible that we may, for instance, decide to change the way
the ant build scripts work to make it easier for people to build the
proprietary connectors after the fact, for instance.  Or we could
claim that the release is just the sources, either way.)

After that, we need to figure out what tickets we still want done
before the release occurs.  I'd argue for more testing, and I'm also
trying to figure out issues pertaining to Documentum and FileNet,
because these connectors require sidecar processes that are not well
supported in the example.  We could go substantially beyond that, but
I agree with Jack that 0.1 would be useful if we only get that far.

Thoughts?
Karl



On Tue, Nov 9, 2010 at 8:58 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

At least get a release 0.1 dry-run with code as-is out ASAP to flush out
release process issues. This would help to send out a message to the rest of
the world that MCF is an available product rather than purely
development/incubation.

Then come up with a list of issues that people strongly feel need to be
resolved before a true, squeaky-clean 1.0 release. Maybe that is the
original list of tasks, including better testing, but some review/decisions
are probably needed. That will be the ultimate target.

Then decide on a close enough subset of issues that would constitute what
people consider a solid beta and target that as a release 0.5 and focus on
that as the near-term target (after getting 0.1 out ASAP.) I personally do
not have any major issues on the top of my head that I would hold out as
blockers for a 0.5.

Or, get 0.1 out and then move on to a 0.2, etc. on a monthly/bi-monthly
basis as progress is made.

In short, get MCF as-is 0.1 out ASAP, have a very short list for MCF 0.5 to
get it out reasonably soon, and then revisit what 1.0 really means versus
0.6, etc.

-- Jack Krupansky

-Original Message- From: Grant Ingersoll
Sent: Tuesday, November 09, 2010 8:38 AM
To: connectors-dev@incubator.apache.org
Subject: Release?

Now that we have NTLM figured out and the Memex stuff behind us, how do
people feel about working towards a release?

-Grant





Re: Release?

2010-11-09 Thread Jack Krupansky
And the wiki doc is also part of the release. Does this stuff get a 
version/release as well? Presumably we want doc for currently supported 
releases, and the doc can vary between releases. Can we easily snapshot the 
wiki?


Will we have nightly builds in place? I think a 0.1 can get released without 
a nightly build, but it would be nice to say that we also have a rolling 
trunk release which is just the latest build off trunk and the latest 
wiki/doc as well. So, some people may want the official 0.1, but others may 
want to run straight from trunk/nightly build.


-- Jack Krupansky

-Original Message- 
From: Karl Wright

Sent: Tuesday, November 09, 2010 1:56 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

Proposal:  Release to consist of two things: tar and zip of a complete
source tree, and tar and zip of the modules/dist area after the build.
The implied way people are to work with this is:

- to use just the distribution, untar or unzip the distribution
zip/tar into a work area, and either use the multiprocess version, or
the quickstart example.
- to add a connector, untar or unzip the source zip/tar into a work
area, and integrate your connector into the build.

Is this acceptable for a 0.1 release?

Karl
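
The two-format packaging in this proposal (a tar and a zip of the same
tree) can be sketched in Python as below. This is only an illustration
under stated assumptions: the thread builds its deliverables with ant,
not with this code, and the function name, directory and base name are
hypothetical.

```python
import shutil


def package_tree(tree_dir, base_name):
    """Write base_name.tar.gz and base_name.zip from the contents of tree_dir.

    base_name should be an absolute path outside tree_dir so the archives
    are not swept into themselves. Returns the two archive paths.
    """
    return [
        shutil.make_archive(base_name, fmt, root_dir=tree_dir)
        for fmt in ("gztar", "zip")
    ]
```

Calling package_tree once on the source tree and once on the built
distribution area would give the four files the proposal describes.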

On Tue, Nov 9, 2010 at 10:22 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
Oh, I wasn't intending to disparage the RSS or other connectors, just giving
my own priority list of must haves. By all means, the well-supported
connector list should be whatever list you want to feel is appropriate and
exclude only those where we feel that we would not be able to provide
sufficient support and assistance online.

That's great that qBase is offering access.

BTW, I was just thinking that maybe we should try to keep logs of each
connector type in action so that people have a reference to consult when
debugging their own connector-related problems. In other words, what a
successful connection session is supposed to look like. So, have a test and
its reference log.

-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Tuesday, November 09, 2010 9:46 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

If you can claim well supported for the web connector, you certainly
should be able to claim it for the RSS connector.  You could also
reasonably include the JDBC connector because it does not require a
proprietary system to test.

But if your definition is that tests exist for all the well
supported ones, somebody has some work to do.  I'd like to see a plan
on how we get from where we are now to a more comprehensive set of
tests.  I've gotten qBase to agree to let me have access to their Q/A
infrastructure (which used to be MetaCarta's), but that's only going
to be helpful for diagnosing problems and doing development, not for
automated tests that anyone can run.

Karl

On Tue, Nov 9, 2010 at 9:38 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


And one of the issues on the list should be to define the "well-supported"
connectors for 0.5 (or whatever) as opposed to the "code is there and
thought to work, you are on your own for testing/support" connectors.
Longer term, we should get most/all connectors into the well-supported
category, but I wouldn't use that as the bar for even 1.0.

My personal minimum well-supported connector list for a 0.5 would be
file system, web, and SharePoint*.

* Oh... there is the issue of SharePoint 2010 or whatever the latest is,
but current MCF support should be good enough for a 0.5 release, I think.

(Got to keep up with Google Connectors!)

-- Jack Krupansky

-Original Message- From: Karl Wright
Sent: Tuesday, November 09, 2010 9:28 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Release?

I'm in favor of a release.  I'm not sure, though, what the release
parameters ought to be.  I think the minimum is that we need to build
a release infrastructure and plan, set up a release process, and
decide what the release packaging should look like (zip's, tar's,
sources, deliverables) and where the javadoc will be published online.
(It's possible that we may, for instance, decide to change the way
the ant build scripts work to make it easier for people to build the
proprietary connectors after the fact, for instance.  Or we could
claim that the release is just the sources, either way.)

After that, we need to figure out what tickets we still want done
before the release occurs.  I'd argue for more testing, and I'm also
trying to figure out issues pertaining to Documentum and FileNet,
because these connectors require sidecar processes that are not well
supported in the example.  We could go substantially beyond that, but
I agree with Jack that 0.1 would be useful if we only get that far.

Thoughts?
Karl



On Tue, Nov 9, 2010 at 8:58 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


At least get a release 0.1 dry-run with code as-is out ASAP to flush out
release

Re: Naming status?

2010-10-17 Thread Jack Krupansky
Good enough for me. So I'll consider this deal closed -- by the community, 
and that the name Apache ManifoldCF is official or as official as it can 
get.


-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Sunday, October 17, 2010 9:17 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Naming status?

I meant to say, I think we met the concerns of others.  I don't think we 
need to go rehash it again.



On Oct 16, 2010, at 7:15 AM, Jack Krupansky wrote:

I'd prefer that Grant close the deal on the name since he is the one 
with clout at Apache, but you could do it as well, I suppose - I don't 
know how these things really work yet.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Saturday, October 16, 2010 7:04 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Naming status?


The vote made it official, as far as I can tell.  If you are referring
to incubator-general, someone probably should post, yes, but given
that the name originally came from there I think that's a formality.
Do you or Grant want to do this?  Or shall I?

Changing the wiki name requires a ticket to be entered for
INFRASTRUCTURE.  No other problem with doing it.
Changing the repository name will yank any changed workareas out from
under the committers, and people who have checkouts.  I believe you
can fix this with svn switch.  On the other hand, I'd prefer to
finish off a few things first, so I don't risk losing them.
Changing mailing lists and JIRA base name is much more problematic
since we'd lose tickets and history, I think.  So I'd prefer to leave
those alone if we can.

Karl


On Sat, Oct 16, 2010 at 6:57 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
Where are we as far as making the ManifoldCF name official? Or is it 
merely a matter of posting it on the general list and seeing if anybody 
has any major objection?


Some other name references that presumably simply await making the name 
official:


1. The repository is still lcf.
2. The wiki top level name is Apache Connectors Framework on the Apache 
Dashboard.

3. The mailing list names are connectors-*.
4. The Jira name is CONNECTORS.

Are there any risks with changing the latter two?

-- Jack Krupansky




--
Grant Ingersoll
http://www.lucidimagination.com



Re: Naming status?

2010-10-16 Thread Jack Krupansky
I'd prefer that Grant close the deal on the name since he is the one with 
clout at Apache, but you could do it as well, I suppose - I don't know how 
these things really work yet.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Saturday, October 16, 2010 7:04 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Naming status?


The vote made it official, as far as I can tell.  If you are referring
to incubator-general, someone probably should post, yes, but given
that the name originally came from there I think that's a formality.
Do you or Grant want to do this?  Or shall I?

Changing the wiki name requires a ticket to be entered for
INFRASTRUCTURE.  No other problem with doing it.
Changing the repository name will yank any changed workareas out from
under the committers, and people who have checkouts.  I believe you
can fix this with svn switch.  On the other hand, I'd prefer to
finish off a few things first, so I don't risk losing them.
Changing mailing lists and JIRA base name is much more problematic
since we'd lose tickets and history, I think.  So I'd prefer to leave
those alone if we can.

Karl


On Sat, Oct 16, 2010 at 6:57 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
Where are we as far as making the ManifoldCF name official? Or is it 
merely a matter of posting it on the general list and seeing if anybody 
has any major objection?


Some other name references that presumably simply await making the name 
official:


1. The repository is still lcf.
2. The wiki top level name is Apache Connectors Framework on the Apache 
Dashboard.

3. The mailing list names are connectors-*.
4. The Jira name is CONNECTORS.

Are there any risks with changing the latter two?

-- Jack Krupansky 




[jira] Commented: (CONNECTORS-116) Possibly remove memex connector depending upon legal resolution

2010-10-13 Thread Jack Krupansky (JIRA)

[ https://issues.apache.org/jira/browse/CONNECTORS-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920568#action_12920568 ]

Jack Krupansky commented on CONNECTORS-116:
---

It would be nice to see a comment about what would be required to add Memex 
support back.

I note the following statement in the original incubation submission:

It is unlikely that EMC, OpenText, Memex, or IBM would grant 
Apache-license-compatible use of these client libraries. Thus, the expectation 
is that users of these connectors obtain the necessary client libraries from 
the owners prior to building or using the corresponding connector. An 
alternative would be to undertake a clean-room implementation of the client 
API's, which may well yield suitable results in some cases (LiveLink, Memex, 
FileNet), while being out of reach in others (Documentum). Conditional 
compilation, for the short term, is thus likely to be a necessity.

Is it only the Memex connector that now has this problem?

Do we need to do a clean-room implementation for Memex? For any of the others?

FWIW, I don't see a Google Connector for Memex.


 Possibly remove memex connector depending upon legal resolution
 ---

 Key: CONNECTORS-116
 URL: https://issues.apache.org/jira/browse/CONNECTORS-116
 Project: ManifoldCF
  Issue Type: Task
  Components: Memex connector
Reporter: Robert Muir
Assignee: Robert Muir

 Apparently there is an IP problem with the memex connector code.
 Depending upon what apache legal says, we will take any action under this 
 issue publicly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (CONNECTORS-118) Crawled archive files should be expanded into their constituent files

2010-10-13 Thread Jack Krupansky (JIRA)
Crawled archive files should be expanded into their constituent files
-

 Key: CONNECTORS-118
 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework crawler agent
Reporter: Jack Krupansky


Archive files such as zip, mbox, tar, etc. should be expanded into their 
constituent files during crawling of repositories so that any output connector 
would output the flattened archive.

This could be an option, defaulted to ON, since someone may want to implement a 
copy connector that maintains crawled files as-is.
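
A rough sketch of the behavior requested above, written in Python purely for 
illustration (the real framework is Java). The function name crawl_document, 
the expand_archives flag, and the "!/" id separator are all hypothetical and 
not part of any existing MCF API; only zip is handled here, and nested 
archives are flattened recursively.

```python
import io
import zipfile
from typing import Iterator, Tuple


def crawl_document(name: str, data: bytes,
                   expand_archives: bool = True) -> Iterator[Tuple[str, bytes]]:
    """Yield (id, content) pairs for one crawled document.

    Hypothetical helper: with expand_archives on (the proposed default),
    a zip archive is replaced by its constituent files; with it off, the
    archive is passed through as-is, as a copy connector would want.
    """
    # Note: is_zipfile also matches zip-based formats such as .jar or .docx.
    if expand_archives and zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # directories carry no content of their own
                child_name = f"{name}!/{info.filename}"
                # Recurse so that archives nested inside archives are flattened too.
                yield from crawl_document(child_name, archive.read(info),
                                          expand_archives)
    else:
        yield name, data
```

An output connector would then simply receive each yielded pair; turning the 
option off yields the single original archive unchanged.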





[jira] Commented: (CONNECTORS-118) Crawled archive files should be expanded into their constituent files

2010-10-13 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920730#action_12920730
 ] 

Jack Krupansky commented on CONNECTORS-118:
---

Support within the file system connector is obviously the higher priority. 
Windows shares as well. And FTP/SFTP.


 Crawled archive files should be expanded into their constituent files
 -

 Key: CONNECTORS-118
 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework crawler agent
Reporter: Jack Krupansky

 Archive files such as zip, mbox, tar, etc. should be expanded into their 
 constituent files during crawling of repositories so that any output 
 connector would output the flattened archive.
 This could be an option, defaulted to ON, since someone may want to implement 
 a copy connector that maintains crawled files as-is.




[jira] Commented: (CONNECTORS-118) Crawled archive files should be expanded into their constituent files

2010-10-13 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920787#action_12920787
 ] 

Jack Krupansky commented on CONNECTORS-118:
---

Aperture's approach was just a starting point for discussion for how to form an 
id for a file in an archive file. As long as the MCF rules are functionally 
equivalent to the Apache VFS rules, we should be okay.

In short, my proposal does not have a requirement for what an id should look 
like, just a suggestion.
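
To make the id question concrete, here is one possible shape, sketched in 
Python for illustration only. It follows the layered "scheme:outer!/inner" 
URI style used by Apache Commons VFS; the function names archive_member_id 
and split_member_id are made up for this example, and nothing here is a 
requirement on what an MCF id must look like.

```python
from typing import Tuple


def archive_member_id(scheme: str, archive_id: str, member_path: str) -> str:
    """Build an id such as 'zip:file:///data/docs.zip!/reports/q3.txt'.

    scheme      -- archive type, e.g. 'zip' or 'tar' (illustrative values)
    archive_id  -- id of the enclosing archive, which may itself be layered
    member_path -- path of the file inside the archive
    """
    return f"{scheme}:{archive_id}!/{member_path.lstrip('/')}"


def split_member_id(member_id: str) -> Tuple[str, str, str]:
    """Inverse of archive_member_id: return (scheme, archive_id, member_path)."""
    # The innermost member is whatever follows the last '!/' separator.
    outer, _, member_path = member_id.rpartition("!/")
    # The outermost scheme is whatever precedes the first ':'.
    scheme, _, archive_id = outer.partition(":")
    return scheme, archive_id, member_path
```

Because archive_id can itself be a layered id, a file inside a tar inside a 
gzip still gets a single unambiguous id, and the round trip through 
split_member_id recovers the enclosing archive's id.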


 Crawled archive files should be expanded into their constituent files
 -

 Key: CONNECTORS-118
 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework crawler agent
Reporter: Jack Krupansky

 Archive files such as zip, mbox, tar, etc. should be expanded into their 
 constituent files during crawling of repositories so that any output 
 connector would output the flattened archive.
 This could be an option, defaulted to ON, since someone may want to implement 
 a copy connector that maintains crawled files as-is.




[jira] Commented: (CONNECTORS-118) Crawled archive files should be expanded into their constituent files

2010-10-13 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920801#action_12920801
 ] 

Jack Krupansky commented on CONNECTORS-118:
---

One of those VFS links points to all the Java packages used to access the list 
of archive formats I listed. I have personally written unit tests that 
generated most of those formats which Aperture then extracted.


 Crawled archive files should be expanded into their constituent files
 -

 Key: CONNECTORS-118
 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework crawler agent
Reporter: Jack Krupansky

 Archive files such as zip, mbox, tar, etc. should be expanded into their 
 constituent files during crawling of repositories so that any output 
 connector would output the flattened archive.
 This could be an option, defaulted to ON, since someone may want to implement 
 a copy connector that maintains crawled files as-is.




[jira] Issue Comment Edited: (CONNECTORS-118) Crawled archive files should be expanded into their constituent files

2010-10-13 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920801#action_12920801
 ] 

Jack Krupansky edited comment on CONNECTORS-118 at 10/13/10 7:35 PM:
-

I have personally written unit tests that generated most of those formats which 
Aperture then extracted.

See:
http://sourceforge.net/apps/trac/aperture/wiki/SubCrawlers

org.apache.tools.bzip2 - BZIP2 archives.
java.util.zip.GZIPInputStream - GZIP archives.
javax.mail   - message/rfc822-style messages and mbox files.
org.apache.tools.tar - tar archives.



  was (Author: jkrupan):
One of those VFS links points to all the Java packages used to access the 
list of archive formats I listed. I have personally written unit tests that 
generated most of those formats which Aperture then extracted.

  
 Crawled archive files should be expanded into their constituent files
 -

 Key: CONNECTORS-118
 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework crawler agent
Reporter: Jack Krupansky

 Archive files such as zip, mbox, tar, etc. should be expanded into their 
 constituent files during crawling of repositories so that any output 
 connector would output the flattened archive.
 This could be an option, defaulted to ON, since someone may want to implement 
 a copy connector that maintains crawled files as-is.




Re: [RESULT][VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-10-04 Thread Jack Krupansky
That sounds good, and it is great to finally see the project naming move 
towards a final form that can stand up to even the most rigorous challenge. 
Long live ManifoldCF!


That said, we may still choose to informally refer to MCF or mcf, although 
we should of course promote the proper name, either as ManifoldCF or Apache 
ManifoldCF (or Apache Manifold Connectors Framework?), as often and widely 
as possible.


Did we ever settle whether that long-form name with CF expanded was okay for 
descriptive purposes even if the official Apache project name is Apache 
ManifoldCF?


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Monday, October 04, 2010 9:56 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [RESULT][VOTE] Rename Apache Connectors Framework to ManifoldCF


On reflection, I've actually decided to just use manifoldcf
everywhere, just because that's least likely to run into problems in
the long run.
Karl


On Sun, Oct 3, 2010 at 9:35 PM, Karl Wright daddy...@gmail.com wrote:

I think using mcf in the package name and the names of the webapps
will likely be fine.  I'm less worried about everything else.  Grant,
any comments?

Karl


On Sun, Oct 3, 2010 at 8:58 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
I'm okay with all of that, but with a question whether we can get away with
using an abbreviation in org.apache.mcf as opposed to
org.apache.manifoldcf. And then, whether the graduated project would be at
http://mcf.apache.org or http://manifoldcf.apache.org. I have no idea
whether there might be pushback higher up on that, but my inclination is to
go ahead with using mcf. I'll defer to Karl as to whether he wants to verify
our assumption through/with Grant/et al or just go ahead.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Sunday, October 03, 2010 7:35 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: [RESULT][VOTE] Rename Apache Connectors Framework to ManifoldCF


The vote passes.  Barely.  Total count +1.  Although I believe we
didn't hear from lots of folks that made ManifoldCF their #1 choice
last time.

So, our new name is ManifoldCF.  I'm thinking this will translate to:

org.apache.mcf
MCFException
MCF abbreviation
ManifoldCF full name
webapps mcf-crawler-ui, mcf-authority-service, mcf-api-service


... and I can begin to change the tree around probably by tomorrow
morning.  Sound okay to everyone?

Karl







Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-29 Thread Jack Krupansky
My apologies for not vetting connex properly. I actually do recall seeing 
the sourceforge project now when reminded of it, but I was working 
too quickly and didn't get around to editing my list right away and forgot 
about it. And I assumed that Karl was going to vet names properly anyway. 
So... it's my fault that manicon wasn't there as the top choice when 
people voted!


In any case, I think I have lost track of where we were in the process... 
voting +1/-1 on keeping ManifoldCF vs. staying with Apache Connectors 
Framework, I think? And with other Karl and I voting on that so far (+1 
and -1)? I'll let Karl send out a proper reminder of wherever he says we 
are.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 29, 2010 11:08 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

May I point out that we've been discussing this issue for over two months 
now?


We just went through a process of gathering names, came up with some
35 candidates, and voted to rank them.  This process just ended, our
best candidate turned out to not have been submitted properly, but we
still have some 15 other names that people legitimately selected,
ranked in order.

Prior to that, we did a previous round where we did EXACTLY the same
thing, and Apache Connectors Framework was the winning selection,
followed by ManifoldCF.  It sounds now like you are looking for yet a
third round?  Unless you claim that the candidate list was simply not
broad enough, I can see no hope of gain by doing that.

Karl

On Wed, Sep 29, 2010 at 11:01 AM, Upayavira u...@odoko.co.uk wrote:

Some while back I suggested manifolio. But that breaches the four
syllable rule :-)

How about Manifole?

I'd say rather than bursting into votes, keep the discussion going, I
suspect you'll know when you've got enough of the community behind you,
and when it is then worth wrapping the whole thing up with a vote - at
which point the vote is a mere formality.

Worth giving it the effort now, see this recent post [1] - a name is
going to stay with us all for a long time!

Upayavira

[1] http://enthusiasm.cozy.org/archives/2010/09/first-time-right

On Tue, 2010-09-28 at 20:08 -0400, Karl Wright wrote:

Actually, an abbreviation of AMCF is not bad either; I kinda like
that myself.  But I'm still not sure I like any of the book title
choices I've offered myself here.

Do we dare use Manifold Connectors Framework in Action?  and
describe AMCF as Manifold Connectors Framework at times?

Karl

On Tue, Sep 28, 2010 at 8:04 PM, Karl Wright daddy...@gmail.com wrote:
 If this is adopted, I'm thinking we could use it in the following 
 ways:


 Abbreviation: MCF
 Short name: ManifoldCF
 Qualified short name: Apache ManifoldCF
 Fully qualified and unabbreviated name: the Apache Manifold
 Connectors Framework

 I'm not quite sure what the world will think of that last usage, since
 it does not contain the trademark.  Then again, neither does the
 abbreviation.  But I'm not sure I'd dare make the book title be
 Apache Manifold Connectors Framework in Action.  It would probably
 need to be Apache ManifoldCF in Action, or just ManifoldCF in
 Action.

 Grant, you wrote a book.  What do you think?  Which title should be 
 used?


 Karl



 On Tue, Sep 28, 2010 at 7:30 PM, Jack Krupansky
 jack.krupan...@lucidimagination.com wrote:
 -1 for me. Standing alone it's an okay name, but trying to actually 
 use it
 is a pain (and we might as well call it MCF). But I'll certainly go 
 along

 with the majority.

 -- Jack Krupansky

 --
 From: Karl Wright daddy...@gmail.com
 Sent: Tuesday, September 28, 2010 7:25 PM
 To: connectors-dev@incubator.apache.org
 Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

 Ok, I just want an up-or-down vote on ManifoldCF at this point.  +1 
 from

 me.

 Karl

 On Tue, Sep 28, 2010 at 7:22 PM, Mark Miller markrmil...@gmail.com
 wrote:

 On 9/28/10 7:10 PM, Jack Krupansky wrote:

 Fair enough. I could live with any of the other choices, but 
 having this
 CF suffix really messes a lot of stuff up and is less practical 
 than
 any of the other names. Basically, it means we may end up having 
 to use

 MCF as the shorthand name.

 Wait... stop the presses... I just realized that ManifoldCF 
 violates

 selection rule #5:

 (5) No more than 4 syllables

 Man-I-fold-C-F (or is in Ma-ni-fold-C-F.)

 That's five syllables.

 ManifoldCF was already in the running. And its obvious that having 
 too
 many syllables is not a problem - it was the second most voted 
 name -

 for the *second* time at least (who can track all these votes).


 And, technically, I would say that it at least half violates the 
 spirit

 of rule #1:

 (1) It's a single word

 It is a single word plus this extra CF acronym thing.

 That's a stretch that the rational part

Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-29 Thread Jack Krupansky
Ah, okay, that's cool. So if the vote fails (<= 0 or < 0?), we would then 
vote on the next choice, which is... Manicon.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 29, 2010 12:01 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


We're trying for an up/down confirmation of ManifoldCF as a new name
for the project.  If it succeeds, that's our name.  If it fails, it's
on to the next-highest-ranking choice.  Right now score is 0.

Karl

On Wed, Sep 29, 2010 at 11:34 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

My apologies for not vetting connex properly. I actually do recall seeing
the sourceforge project now when reminded of it, but I was 
working
too quickly and didn't get around to editing my list right away and 
forgot
about it. And I assumed that Karl was going to vet names properly 
anyway.

So... it's my fault that manicon wasn't there as the top choice when
people voted!

In any case, I think I have lost track of where we were in the process...
voting +1/-1 on keeping ManifoldCF vs. staying with Apache Connectors
Framework, I think? And with other Karl and I voting on that so far (+1 
and

-1)? I'll let Karl send out a proper reminder of wherever he says we are.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 29, 2010 11:08 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

May I point out that we've been discussing this issue for over two 
months

now?

We just went through a process of gathering names, came up with some
35 candidates, and voted to rank them.  This process just ended, our
best candidate turned out to not have been submitted properly, but we
still have some 15 other names that people legitimately selected,
ranked in order.

Prior to that, we did a previous round where we did EXACTLY the same
thing, and Apache Connectors Framework was the winning selection,
followed by ManifoldCF.  It sounds now like you are looking for yet a
third round?  Unless you claim that the candidate list was simply not
broad enough, I can see no hope of gain by doing that.

Karl

On Wed, Sep 29, 2010 at 11:01 AM, Upayavira u...@odoko.co.uk wrote:


Some while back I suggested manifolio. But that breaches the four
syllable rule :-)

How about Manifole?

I'd say rather than bursting into votes, keep the discussion going, I
suspect you'll know when you've got enough of the community behind you,
and when it is then worth wrapping the whole thing up with a vote - at
which point the vote is a mere formality.

Worth giving it the effort now, see this recent post [1] - a name is
going to stay with us all for a long time!

Upayavira

[1] http://enthusiasm.cozy.org/archives/2010/09/first-time-right

On Tue, 2010-09-28 at 20:08 -0400, Karl Wright wrote:


Actually, an abbreviation of AMCF is not bad either kinda like
that myself.  But I'm still not sure I like any of the book title
choices I've offered myself here.

Do we dare use Manifold Connectors Framework in Action?  and
describe AMCF as Manifold Connectors Framework at times?

Karl

On Tue, Sep 28, 2010 at 8:04 PM, Karl Wright daddy...@gmail.com 
wrote:

 If this is adopted, I'm thinking we could use it in the following 
 ways:

 Abbreviation: MCF
 Short name: ManifoldCF
 Qualified short name: Apache ManifoldCF
 Fully qualified and unabbreviated name: the Apache Manifold
 Connectors Framework

 I'm not quite sure what the world will think of that last usage, 
 since

 it does not contain the trademark.  Then again, neither does the
 abbreviation.  But I'm not sure I'd dare make the book title be
 Apache Manifold Connectors Framework in Action.  It would probably
 need to be Apache ManifoldCF in Action, or just ManifoldCF in
 Action.

 Grant, you wrote a book.  What do you think?  Which title should be 
  

 used?

 Karl



 On Tue, Sep 28, 2010 at 7:30 PM, Jack Krupansky
 jack.krupan...@lucidimagination.com wrote:
 -1 for me. Standing alone it's an okay name, but trying to actually
  use it
 is a pain (and we might as well call it MCF). But I'll certainly go
  along
 with the majority.

 -- Jack Krupansky

 --
 From: Karl Wright daddy...@gmail.com
 Sent: Tuesday, September 28, 2010 7:25 PM
 To: connectors-dev@incubator.apache.org
 Subject: Re: [VOTE] Rename Apache Connectors Framework to 
 ManifoldCF


 Ok, I just want an up-or-down vote on ManifoldCF at this point. 
 +1

  from
 me.

 Karl

 On Tue, Sep 28, 2010 at 7:22 PM, Mark Miller 
 markrmil...@gmail.com

 wrote:

 On 9/28/10 7:10 PM, Jack Krupansky wrote:

 Fair enough. I could live with any of the other choices, but 
  

 having this
 CF suffix really messes a lot of stuff up and is less 
 practical

  than
 any of the other names

Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-29 Thread Jack Krupansky
Can we stick with an 8-person minimum quorum for this and most other votes? 
In other words, the vote closes at the deadline if there is a quorum; otherwise 
it stays open until 5 p.m. after there is a quorum.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 29, 2010 1:38 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


<= 0 means failure.

Karl

On Wed, Sep 29, 2010 at 12:09 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

Ah, okay, that's cool. So if the vote fails (<= 0 or < 0?), we would then
vote on the next choice, which is... Manicon.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 29, 2010 12:01 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


We're trying for an up/down confirmation of ManifoldCF as a new name
for the project.  If it succeeds, that's our name.  If it fails, it's
on to the next-highest-ranking choice.  Right now score is 0.

Karl

On Wed, Sep 29, 2010 at 11:34 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


My apologies for not vetting connex properly. I actually do recall 
seeing

the sourceforge project now when reminded of it, but I was
working
too quickly and didn't get around to editing my list right away and
forgot
about it. And I assumed that Karl was going to vet names properly
anyway.
So... it's my fault that manicon wasn't there as the top choice when
people voted!

In any case, I think I have lost track of where we were in the 
process...

voting +1/-1 on keeping ManifoldCF vs. staying with Apache Connectors
Framework, I think? And with other Karl and I voting on that so far (+1
and
-1)? I'll let Karl send out a proper reminder of wherever he says we 
are.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 29, 2010 11:08 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


May I point out that we've been discussing this issue for over two
months
now?

We just went through a process of gathering names, came up with some
35 candidates, and voted to rank them.  This process just ended, our
best candidate turned out to not have been submitted properly, but we
still have some 15 other names that people legitimately selected,
ranked in order.

Prior to that, we did a previous round where we did EXACTLY the same
thing, and Apache Connectors Framework was the winning selection,
followed by ManifoldCF.  It sounds now like you are looking for yet a
third round?  Unless you claim that the candidate list was simply not
broad enough, I can see no hope of gain by doing that.

Karl

On Wed, Sep 29, 2010 at 11:01 AM, Upayavira u...@odoko.co.uk wrote:


Some while back I suggested manifolio. But that breaches the four
syllable rule :-)

How about Manifole?

I'd say rather than bursting into votes, keep the discussion going, I
suspect you'll know when you've got enough of the community behind 
you,
and when it is then worth wrapping the whole thing up with a vote - 
at

which point the vote is a mere formality.

Worth giving it the effort now, see this recent post [1] - a name is
going to stay with us all for a long time!

Upayavira

[1] http://enthusiasm.cozy.org/archives/2010/09/first-time-right

On Tue, 2010-09-28 at 20:08 -0400, Karl Wright wrote:


Actually, an abbreviation of AMCF is not bad either kinda like
that myself.  But I'm still not sure I like any of the book title
choices I've offered myself here.

Do we dare use Manifold Connectors Framework in Action?  and
describe AMCF as Manifold Connectors Framework at times?

Karl

On Tue, Sep 28, 2010 at 8:04 PM, Karl Wright daddy...@gmail.com
wrote:
 If this is adopted, I'm thinking we could use it in the following 
  

 ways:

 Abbreviation: MCF
 Short name: ManifoldCF
 Qualified short name: Apache ManifoldCF
 Fully qualified and unabbreviated name: the Apache Manifold
 Connectors Framework

 I'm not quite sure what the world will think of that last usage, 
 since
 it does not contain the trademark.  Then again, neither does the
 abbreviation.  But I'm not sure I'd dare make the book title be
 Apache Manifold Connectors Framework in Action.  It would 
 probably

 need to be Apache ManifoldCF in Action, or just ManifoldCF in
 Action.

 Grant, you wrote a book.  What do you think?  Which title should 
 be

   
 used?

 Karl



 On Tue, Sep 28, 2010 at 7:30 PM, Jack Krupansky
 jack.krupan...@lucidimagination.com wrote:
 -1 for me. Standing alone it's an okay name, but trying to 
 actually

  use it
 is a pain (and we might as well call it MCF). But I'll certainly 
 go

  along
 with the majority.

 -- Jack Krupansky

Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-28 Thread Jack Krupansky
Fair enough. I could live with any of the other choices, but having this 
CF suffix really messes a lot of stuff up and is less practical than any 
of the other names. Basically, it means we may end up having to use MCF as 
the shorthand name.


Wait... stop the presses... I just realized that ManifoldCF violates 
selection rule #5:


(5) No more than 4 syllables

Man-I-fold-C-F (or is it Ma-ni-fold-C-F?)

That's five syllables.

And, technically, I would say that it at least half violates the spirit of 
rule #1:


(1) It's a single word

It is a single word plus this extra CF acronym thing.

So, next candidate on the list was... Manicon, 19

Unless it has legal problems, it fits our requirements.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 6:52 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


Jack,

That's one of the main purposes of having everyone list choices by
priority.  If one doesn't work, there are others you can use.

I don't want to open that vote again unless the community decides that
the list of candidate names was simply not rich enough to furnish a
good choice.

Karl


On Tue, Sep 28, 2010 at 6:49 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

Or Nocon or Noman.

I know people are tired of voting, but I think we should really re-vote 
for

the revised candidate list with Connex removed.

-- Jack Krupansky

--
From: Mark Miller markrmil...@gmail.com
Sent: Tuesday, September 28, 2010 6:43 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


hmmm...I think I'm all voted out. Can we just call it nothing?

On 9/28/10 6:40 PM, Karl Wright wrote:


Vote +1 to rename Apache Connectors Framework to Apache ManifoldCF.
Vote -1 to keep the project name of Connectors Framework, or to retain
Connex, if that wins its vote.

This vote also expires end of day on Friday.

Note: Manifold is a trademark for a GIS software product.  However,
I agree with Grant that ManifoldCF appearing under the Apache label
should be safe to be used.  But you should recognize that this vote is
not merely a referendum on the name itself, but also on the
suitability of the name in a legal context.

Karl






Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-28 Thread Jack Krupansky
Karl, you are the de facto naming czar. You get to take the community input 
and figure out how to interpret it so that it reflects a general sense of 
the spirit of the community.


So, now you get to rule on my objections to ManifoldCF! And the chips can 
fall where they may.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 6:46 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


I'm tempted.  Apache Nothing has a nice ring to it. ;-)

Maybe we should just give up with the voting and appoint a Naming
Czar.  Seriously.

Karl

On Tue, Sep 28, 2010 at 6:43 PM, Mark Miller markrmil...@gmail.com 
wrote:

hmmm...I think I'm all voted out. Can we just call it nothing?

On 9/28/10 6:40 PM, Karl Wright wrote:

Vote +1 to rename Apache Connectors Framework to Apache ManifoldCF.
Vote -1 to keep the project name of Connectors Framework, or to retain
Connex, if that wins its vote.

This vote also expires end of day on Friday.

Note: Manifold is a trademark for a GIS software product.  However,
I agree with Grant that ManifoldCF appearing under the Apache label
should be safe to be used.  But you should recognize that this vote is
not merely a referendum on the name itself, but also on the
suitability of the name in a legal context.

Karl





Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-28 Thread Jack Krupansky
-1 for me. Standing alone it's an okay name, but trying to actually use it 
is a pain (and we might as well call it MCF). But I'll certainly go along 
with the majority.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 7:25 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

Ok, I just want an up-or-down vote on ManifoldCF at this point.  +1 from 
me.


Karl

On Tue, Sep 28, 2010 at 7:22 PM, Mark Miller markrmil...@gmail.com 
wrote:

On 9/28/10 7:10 PM, Jack Krupansky wrote:

Fair enough. I could live with any of the other choices, but having this
CF suffix really messes a lot of stuff up and is less practical than
any of the other names. Basically, it means we may end up having to use
MCF as the shorthand name.

Wait... stop the presses... I just realized that ManifoldCF violates
selection rule #5:

(5) No more than 4 syllables

Man-I-fold-C-F (or is in Ma-ni-fold-C-F.)

That's five syllables.


ManifoldCF was already in the running. And it's obvious that having too
many syllables is not a problem - it was the second most voted name -
for the *second* time at least (who can track all these votes).



And, technically, I would say that it at least half violates the spirit
of rule #1:

(1) It's a single word

It is a single word plus this extra CF acronym thing.


That's a stretch that the rational part of my brain is going to ignore.
This is no argument.



So, next candidate on the list was... Manicon, 19

Unless it has legal problems, it fits our requirements.


Okay, let's vote again. For some reason ManifoldCF will stop topping the
list, why? Everyone will come to their senses? Some of us are so sick of
this name thing we won't vote, and if you're lucky those will be the
ManifoldCF supporters? I mean come on...



-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 6:52 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


Jack,

That's one of the main purposes of having everyone list choices by
priority.  If one doesn't work, there are others you can use.

I don't want to open that vote again unless the community decides that
the list of candidate names was simply not rich enough to furnish a
good choice.

Karl


On Tue, Sep 28, 2010 at 6:49 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

Or Nocon or Noman.

I know people are tired of voting, but I think we should really
re-vote for
the revised candidate list with Connex removed.

-- Jack Krupansky

--
From: Mark Miller markrmil...@gmail.com
Sent: Tuesday, September 28, 2010 6:43 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


hmmm...I think I'm all voted out. Can we just call it nothing?

On 9/28/10 6:40 PM, Karl Wright wrote:


Vote +1 to rename Apache Connectors Framework to Apache ManifoldCF.
Vote -1 to keep the project name of Connectors Framework, or to 
retain

Connex, if that wins its vote.

This vote also expires end of day on Friday.

Note: Manifold is a trademark for a GIS software product. 
However,

I agree with Grant that ManifoldCF appearing under the Apache label
should be safe to be used.  But you should recognize that this vote 
is

not merely a referendum on the name itself, but also on the
suitability of the name in a legal context.

Karl









Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF

2010-09-28 Thread Jack Krupansky
That is roughly the usage I would expect. As far as a book title, tough 
call. There is the length issue. If somebody already knows of ManifoldCF, 
Apache ManifoldCF in Action makes sense, but Apache Manifold Connectors 
Framework in Action is a bit more descriptive.


Bottom line, a name with three basic, core variations: ManifoldCF, MCF, 
Manifold Connectors Framework.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 8:04 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


If this is adopted, I'm thinking we could use it in the following ways:

Abbreviation: MCF
Short name: ManifoldCF
Qualified short name: Apache ManifoldCF
Fully qualified and unabbreviated name: the Apache Manifold
Connectors Framework

I'm not quite sure what the world will think of that last usage, since
it does not contain the trademark.  Then again, neither does the
abbreviation.  But I'm not sure I'd dare make the book title be
Apache Manifold Connectors Framework in Action.  It would probably
need to be Apache ManifoldCF in Action, or just ManifoldCF in
Action.

Grant, you wrote a book.  What do you think?  Which title should be used?

Karl



On Tue, Sep 28, 2010 at 7:30 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
-1 for me. Standing alone it's an okay name, but trying to actually use it
is a pain (and we might as well call it MCF). But I'll certainly go along
with the majority.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 7:25 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


Ok, I just want an up-or-down vote on ManifoldCF at this point.  +1 from
me.

Karl

On Tue, Sep 28, 2010 at 7:22 PM, Mark Miller markrmil...@gmail.com
wrote:


On 9/28/10 7:10 PM, Jack Krupansky wrote:


Fair enough. I could live with any of the other choices, but having this
CF suffix really messes a lot of stuff up and is less practical than
any of the other names. Basically, it means we may end up having to use
MCF as the shorthand name.

Wait... stop the presses... I just realized that ManifoldCF violates
selection rule #5:

(5) No more than 4 syllables

Man-I-fold-C-F (or is it Ma-ni-fold-C-F.)

That's five syllables.


ManifoldCF was already in the running. And it's obvious that having too
many syllables is not a problem - it was the second most voted name -
for the *second* time at least (who can track all these votes).



And, technically, I would say that it at least half violates the spirit
of rule #1:

(1) It's a single word

It is a single word plus this extra CF acronym thing.


That's a stretch that the rational part of my brain is going to ignore.
This is no argument.



So, next candidate on the list was... Manicon, 19

Unless it has legal problems, it fits our requirements.


Okay, let's vote again. Why would ManifoldCF stop topping the list?
Everyone will come to their senses? Some of us are so sick of
this name thing we won't vote, and if you're lucky those will be the
ManifoldCF supporters? I mean come on...



-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 28, 2010 6:52 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


Jack,

That's one of the main purposes of having everyone list choices by
priority.  If one doesn't work, there are others you can use.

I don't want to open that vote again unless the community decides that
the list of candidate names was simply not rich enough to furnish a
good choice.

Karl


On Tue, Sep 28, 2010 at 6:49 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


Or Nocon or Noman.

I know people are tired of voting, but I think we should really
re-vote for
the revised candidate list with Connex removed.

-- Jack Krupansky

--
From: Mark Miller markrmil...@gmail.com
Sent: Tuesday, September 28, 2010 6:43 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Rename Apache Connectors Framework to ManifoldCF


hmmm...I think I'm all voted out. Can we just call it nothing?

On 9/28/10 6:40 PM, Karl Wright wrote:


Vote +1 to rename Apache Connectors Framework to Apache ManifoldCF.
Vote -1 to keep the project name of Connectors Framework, or to retain
Connex, if that wins its vote.

This vote also expires end of day on Friday.

Note: Manifold is a trademark for a GIS software product.  However,
I agree with Grant that ManifoldCF appearing under the Apache label
should be safe to be used.  But you should recognize that this vote is
not merely a referendum on the name itself, but also on the
suitability of the name in a legal context.

Karl











Re: [VOTE] Select a name to possibly replace Apache Connectors Framework

2010-09-24 Thread Jack Krupansky

When does this stage of voting close?

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Thursday, September 23, 2010 5:28 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: [VOTE] Select a name to possibly replace Apache Connectors 
Framework



Folks,

Grant feels we would have a better chance of graduating from
incubation without changes if we adopt a new name.  There will thus be
two votes.  First vote is designed to arrive at a name, and the second
vote will be on whether to use that highest-point name instead of
Apache Connectors Framework.

Because the list is quite long this time, please select your favorite
8 choices, in order of preference.  If you submit duplicate choices,
only the first of each duplicate will be counted, and the others will
receive zero points.  So it is in your interest to not select any
duplicates.  All of these choices have been already screened to
fulfill specific criteria, such as avoidance of trademarks or heavily
used words.
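
The ranked ballot rules above can be sketched in code. This is only an illustration under stated assumptions: the thread does not say how many points each rank earns, so the sketch assumes 8 points for a first choice down to 1 for an eighth, and the class and method names (BallotTally, tally) are hypothetical. What it does take from the message is that only the first occurrence of a duplicated name counts.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical tally helper; the point scale is an assumption, not the thread's rule.
class BallotTally {
    // Each ballot lists names in order of preference, most preferred first.
    static Map<String, Integer> tally(List<List<String>> ballots, int maxChoices) {
        Map<String, Integer> points = new TreeMap<>();
        for (List<String> ballot : ballots) {
            Set<String> seen = new HashSet<>();
            int limit = Math.min(maxChoices, ballot.size());
            for (int rank = 0; rank < limit; rank++) {
                String name = ballot.get(rank);
                if (!seen.add(name)) {
                    continue; // duplicate on the same ballot: receives zero points
                }
                // Assumed scale: first choice earns maxChoices points, last earns 1.
                points.merge(name, maxChoices - rank, Integer::sum);
            }
        }
        return points;
    }
}
```

Note that a duplicate still uses up its slot on the ballot in this sketch, which is one reading of why it is in a voter's interest to avoid duplicates.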

The list of candidates is:

Ayvitraya
Conex
Connex
Connie
Connx
Contango
Conton
Contor
Contour
Conx
Heterolink
Heterosource
Heteroweb
Manicon
ManifoldCF
Manifolio
Manilink
Maniplex
Manisource
Maniweb
Multicon
Multiconnect
Multiconnex
Ralph
Reconto
RepoMan
Repositor
Recon
Reconex
Reconn
Reconnex
Reconnx
Reconx

Let the voting begin!
Karl 




Re: Soliciting more potential names for the project formerly known as LCF

2010-09-22 Thread Jack Krupansky

I'd give people another day since it seems so quiet.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 22, 2010 3:40 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Soliciting more potential names for the project formerly known 
as LCF



Any objections if I close this list as of 5:00 PM EDT today, and call
the vote then?

Karl

On Tue, Sep 21, 2010 at 5:08 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

A few more from my noon walk:

Recon [Re from Repository and con from connectors]
Reconn
Conx [ala Connects] - treat these four as one for initial voting, then
separate
Conex
Connx
Connex
Reconx [ala Repository and Connects] - ditto
Reconex
Reconnx
Reconnex
Contor [Con and tor from Connector]
Contour [Contor into a word to avoid misspelling]
Contango [Connectors are a bit of a dance; a term from commodities futures
trading]

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 4:27 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: Soliciting more potential names for the project formerly known as LCF


We're going to have another round of name selection, based on feedback
from the board.  We've also pretty much decided that the last round
did not have sufficient breadth, so here's an open solicitation call
for potential names for the next round.

So hold on to your hats, and throw your suggestions into the ring.

One word of caution.  Before offering a name, please do your best to
make sure it meets the following criteria:

(1) It's a single word
(2) It's not a very commonly-used word, or a well-known place; ideally
it's not a word you would find in the OED
(3) It does not have negative connotations
(4) A quick google with it in the context of open source shows nothing
(5) No more than 4 syllables

Here's what we've got so far for this round:


Congo
Connie
Heterolink
Heterosource
Heteroweb
Manicon
Manifolio
Manilink
Maniplex
Manisource
Maniweb
Multicon
Ralph
Reconto
RepoMan
Repositor
Reptile

Some of these don't fulfill the criteria above (e.g. Congo and
Reptile), and may be stricken from the voting list as a result.  But
feel free to riff at will... ;-)

Karl





Re: Soliciting more potential names for the project formerly known as LCF

2010-09-22 Thread Jack Krupansky
Great! Anyone on this list can nominate names and vote. The more 
participation, the better.


-- Jack Krupansky

--
From: George Aroush geo...@aroush.net
Sent: Wednesday, September 22, 2010 10:47 PM
To: connectors-dev@incubator.apache.org
Subject: RE: Soliciting more potential names for the project formerly known 
as LCF



I have been watching this project from the side line...

Can I nominate Ayvitraya?  From:
http://en.wikipedia.org/wiki/Fictional_universe_in_Avatar#The_Tree_of_Souls

Ayvitraya Ramunong in Na'vi is a tree where the Na'vi are able to
communicate with the biological network that exists.

-- George


-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com]
Sent: Wednesday, September 22, 2010 8:07 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Soliciting more potential names for the project formerly known as LCF

One more day... so nominations close tomorrow at 5:00 PM.

Karl

On Wed, Sep 22, 2010 at 6:37 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

I'd give people another day since it seems so quiet.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, September 22, 2010 3:40 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Soliciting more potential names for the project formerly known as LCF


Any objections if I close this list as of 5:00 PM EDT today, and call
the vote then?

Karl

On Tue, Sep 21, 2010 at 5:08 PM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


A few more from my noon walk:

Recon [Re from Repository and con from connectors]
Reconn
Conx [ala Connects] - treat these four as one for initial voting, then
separate
Conex
Connx
Connex
Reconx [ala Repository and Connects] - ditto
Reconex
Reconnx
Reconnex
Contor [Con and tor from Connector]
Contour [Contor into a word to avoid misspelling]
Contango [Connectors are a bit of a dance; a term from commodities
futures
trading]

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 4:27 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: Soliciting more potential names for the project formerly known
as
LCF


We're going to have another round of name selection, based on feedback
from the board.  We've also pretty much decided that the last round
did not have sufficient breadth, so here's an open solicitation call
for potential names for the next round.

So hold on to your hats, and throw your suggestions into the ring.

One word of caution.  Before offering a name, please do your best to
make sure it meets the following criteria:

(1) It's a single word
(2) It's not a very commonly-used word, or a well-known place; ideally
it's not a word you would find in the OED
(3) It does not have negative connotations
(4) A quick google with it in the context of open source shows nothing
(5) No more than 4 syllables

Here's what we've got so far for this round:


Congo
Connie
Heterolink
Heterosource
Heteroweb
Manicon
Manifolio
Manilink
Maniplex
Manisource
Maniweb
Multicon
Ralph
Reconto
RepoMan
Repositor
Reptile

Some of these don't fulfill the criteria above (e.g. Congo and
Reptile), and may be stricken from the voting list as a result.  But
feel free to riff at will... ;-)

Karl









Re: Exploring ManifoldCF ramifications

2010-09-21 Thread Jack Krupansky
I concur with having a base class for all ACF exceptions, and then the 
specific exceptions extend that base.
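
A minimal sketch of that arrangement, for illustration only: ACFException here is a simplified stand-in for the project's class, and RepositoryTimeoutException and ExceptionDemo are hypothetical names that do not come from the codebase. It shows that a caller can catch either the specific failure or the framework-wide base.

```java
// Simplified stand-in for the framework-wide base exception discussed in the thread.
class ACFException extends Exception {
    ACFException(String message) {
        super(message);
    }
}

// Hypothetical specific exception; its name says what went wrong.
class RepositoryTimeoutException extends ACFException {
    RepositoryTimeoutException(String message) {
        super(message);
    }
}

class ExceptionDemo {
    // Returns which catch clause handled the failure.
    static String classify(boolean timeout) {
        try {
            if (timeout) {
                throw new RepositoryTimeoutException("repository did not respond");
            }
            throw new ACFException("unspecified framework failure");
        } catch (RepositoryTimeoutException e) {
            return "specific"; // handled by the narrow type
        } catch (ACFException e) {
            return "base"; // anything else from the framework
        }
    }
}
```

This is the same shape as IOException and its subclasses: the base type says where the problem came from, and the subclass says what happened.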


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 10:08 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications


The exception naming issue is noted, but that's really a separate
problem.  The IOException exception comes from the IO subsystem, and
it's the base exception of everything from an encoding exception
through a socket problem through a timeout.  ACFException is a similar
base exception class, except it comes from ACF.  So there is a rough
parity there.  If you want to challenge the use of base exception
classes, so be it, but that's not the difficulty with ManifoldCF.

Maybe we don't understand your intended usage of ManifoldCF, since it
seems to me like you possibly meant Apache Manifold Connectors
Framework for the full name?  If so, I certainly don't think any of
us got that.  Can you clarify/confirm?

Karl





On Tue, Sep 21, 2010 at 9:58 AM, Grant Ingersoll gsing...@apache.org 
wrote:
Let's not overly analyze things here.  I'm not saying we need to pick 
Manifold CF, but if we do, we certainly can solve these writing issues by 
either re-writing the sentences in question (instead of search/replace) 
and just use MCF.


As for the Exceptions, I find an exception named ACFException meaningless 
to an app dev. anyway.  Duh it's an ACFException, it came from ACF.  You 
don't call an IOException a JavaException just b/c it came from Java, you 
give it a name that relates to the thing that went wrong, as in something 
went wrong doing IO.  Give it a name that says what happened.


On Sep 21, 2010, at 3:16 AM, Karl Wright wrote:


Folks,

The ManifoldCF name possibility leads to some challenges as far as our
documentation is concerned.  I thought that it might be a good idea
during the vote to explore those to see what people thought.

Here are some examples of how Apache Connectors Framework might get
used in text:

Apache Connectors Framework is an interesting offering from Apache.
ACF links repositories with search indices.  That's what ACF does.
The Apache Connectors Framework is a framework for repository
connectors primarily.

The above is not technically proper.  So instead we might conceivably
have done this:

Apache Connectors Framework is an interesting offering from Apache.
Connectors Framework links repositories with search indices.  That's
what CF does.  The Connectors Framework is a framework for repository
connectors primarily.

What is the equivalent for Apache ManifoldCF?

Apache ManifoldCF is an interesting offering from Apache.  ManifoldCF
links repositories with search indices.  That's what MCF does.
ManifoldCF is a framework for repository connectors primarily.

Note that the difference is that we would never say, The Apache
ManifoldCF...  or The Apache Manifold Connectors Framework..., just
ManifoldCF

Would we want to use the MCF abbreviation at all?  Or just convert ACF
-> ManifoldCF wherever it is found in documentation?

Similarly, the handle acf in package and class names would need to
be addressed:

org.apache.acf.core.interfaces.ACFException -> ?
org.apache.acf.core.system.ACF -> ?

...bearing in mind that you'd better choose a consistent treatment for
uppercase ACF in both contexts.

(FWIW, my initial thought is:

org.apache.acf.core.interfaces.ACFException ->
org.apache.mcf.core.interfaces.ManifoldCFException
org.apache.acf.core.system.ACF -> org.apache.mcf.core.system.ManifoldCF)
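
The mapping suggested above treats the lowercase handle and the uppercase handle differently: acf becomes mcf in package names, while ACF becomes ManifoldCF in class names. A rough sketch of that rule follows; RenameSketch and rename are hypothetical names used only to make the rule concrete, not a tool from the project.

```java
// Hypothetical helper that applies the proposed renaming to a fully qualified class name.
class RenameSketch {
    static String rename(String fullyQualifiedName) {
        int dot = fullyQualifiedName.lastIndexOf('.');
        String pkg = dot < 0 ? "" : fullyQualifiedName.substring(0, dot);
        String cls = fullyQualifiedName.substring(dot + 1);

        // Package handle: org.apache.acf -> org.apache.mcf (root package only).
        String oldRoot = "org.apache.acf";
        if (pkg.equals(oldRoot) || pkg.startsWith(oldRoot + ".")) {
            pkg = "org.apache.mcf" + pkg.substring(oldRoot.length());
        }

        // Class handle: uppercase ACF -> ManifoldCF.
        cls = cls.replace("ACF", "ManifoldCF");

        return pkg.isEmpty() ? cls : pkg + "." + cls;
    }
}
```

Splitting package and class before substituting keeps the two treatments from interfering with each other, which is the consistency concern raised in the message.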

Thoughts?

Karl


--
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8




Re: Exploring ManifoldCF ramifications

2010-09-21 Thread Jack Krupansky
My interpretation from the beginning is that there is a formal name 
prefixed with Apache that would get used external to the project to refer 
to the project, but then within the project we would just use the 
shorthand name, whether that means simply dropping the Apache or 
abbreviating the name with an acronym. If the project name was a short name 
to begin with, then abbreviation would not be needed, but if the name is too 
long and clumsy, an abbreviation might be called for. Manifold would fit 
the short prescription fine, but with ManifoldCF, the temptation to 
shorten it (some people, like me, are clumsy with too much shift key action) 
to MCF is somewhat... obvious. And when you lower-case the name for 
package names to manifoldcf, it kind of looks weird.


-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Tuesday, September 21, 2010 9:58 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications

Let's not overly analyze things here.  I'm not saying we need to pick 
Manifold CF, but if we do, we certainly can solve these writing issues by 
either re-writing the sentences in question (instead of search/replace) 
and just use MCF.


As for the Exceptions, I find an exception named ACFException meaningless 
to an app dev. anyway.  Duh it's an ACFException, it came from ACF.  You 
don't call an IOException a JavaException just b/c it came from Java, you 
give it a name that relates to the thing that went wrong, as in something 
went wrong doing IO.  Give it a name that says what happened.


On Sep 21, 2010, at 3:16 AM, Karl Wright wrote:


Folks,

The ManifoldCF name possibility leads to some challenges as far as our
documentation is concerned.  I thought that it might be a good idea
during the vote to explore those to see what people thought.

Here are some examples of how Apache Connectors Framework might get
used in text:

Apache Connectors Framework is an interesting offering from Apache.
ACF links repositories with search indices.  That's what ACF does.
The Apache Connectors Framework is a framework for repository
connectors primarily.

The above is not technically proper.  So instead we might conceivably
have done this:

Apache Connectors Framework is an interesting offering from Apache.
Connectors Framework links repositories with search indices.  That's
what CF does.  The Connectors Framework is a framework for repository
connectors primarily.

What is the equivalent for Apache ManifoldCF?

Apache ManifoldCF is an interesting offering from Apache.  ManifoldCF
links repositories with search indices.  That's what MCF does.
ManifoldCF is a framework for repository connectors primarily.

Note that the difference is that we would never say, The Apache
ManifoldCF...  or The Apache Manifold Connectors Framework..., just
ManifoldCF

Would we want to use the MCF abbreviation at all?  Or just convert ACF
-> ManifoldCF wherever it is found in documentation?

Similarly, the handle acf in package and class names would need to
be addressed:

org.apache.acf.core.interfaces.ACFException -> ?
org.apache.acf.core.system.ACF -> ?

...bearing in mind that you'd better choose a consistent treatment for
uppercase ACF in both contexts.

(FWIW, my initial thought is:

org.apache.acf.core.interfaces.ACFException ->
org.apache.mcf.core.interfaces.ManifoldCFException
org.apache.acf.core.system.ACF -> org.apache.mcf.core.system.ManifoldCF)

Thoughts?

Karl


--
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8



Re: Exploring ManifoldCF ramifications

2010-09-21 Thread Jack Krupansky
That's a perfect example of what I was trying to suggest and avoids the 
usage problems. It has too many syllables for my taste, but that's 
just me.


-- Jack Krupansky

--
From: Upayavira u...@odoko.co.uk
Sent: Tuesday, September 21, 2010 10:39 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications


Butting in here. You can 'twist' the manifold word in other ways, e.g.
manifolio, or some such - full name The Apache Manifolio Connector
Framework, short name manifolio.

Upayavira

On Tue, 2010-09-21 at 10:26 -0400, Jack Krupansky wrote:

My interpretation from the beginning is that there is a formal name
prefixed with Apache that would get used external to the project to refer
to the project, but then within the project we would just use the
shorthand name, whether that means simply dropping the Apache or
abbreviating the name with an acronym. If the project name was a short name
to begin with, then abbreviation would not be needed, but if the name is too
long and clumsy, an abbreviation might be called for. Manifold would fit
the short prescription fine, but with ManifoldCF, the temptation to
shorten it (some people, like me, are clumsy with too much shift key action)
to MCF is somewhat... obvious. And when you lower-case the name for
package names to manifoldcf, it kind of looks weird.

-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Tuesday, September 21, 2010 9:58 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications

 Let's not overly analyze things here.  I'm not saying we need to pick
 Manifold CF, but if we do, we certainly can solve these writing issues by
 either re-writing the sentences in question (instead of search/replace)
 and just use MCF.

 As for the Exceptions, I find an exception named ACFException meaningless
 to an app dev. anyway.  Duh it's an ACFException, it came from ACF.  You
 don't call an IOException a JavaException just b/c it came from Java, you
 give it a name that relates to the thing that went wrong, as in something
 went wrong doing IO.  Give it a name that says what happened.

 On Sep 21, 2010, at 3:16 AM, Karl Wright wrote:

 Folks,

 The ManifoldCF name possibility leads to some challenges as far as our
 documentation is concerned.  I thought that it might be a good idea
 during the vote to explore those to see what people thought.

 Here are some examples of how Apache Connectors Framework might get
 used in text:

 Apache Connectors Framework is an interesting offering from Apache.
 ACF links repositories with search indices.  That's what ACF does.
 The Apache Connectors Framework is a framework for repository
 connectors primarily.

 The above is not technically proper.  So instead we might conceivably
 have done this:

 Apache Connectors Framework is an interesting offering from Apache.
 Connectors Framework links repositories with search indices.  That's
 what CF does.  The Connectors Framework is a framework for repository
 connectors primarily.

 What is the equivalent for Apache ManifoldCF?

 Apache ManifoldCF is an interesting offering from Apache.  ManifoldCF
 links repositories with search indices.  That's what MCF does.
 ManifoldCF is a framework for repository connectors primarily.

 Note that the difference is that we would never say, The Apache
 ManifoldCF...  or The Apache Manifold Connectors Framework..., just
 ManifoldCF

 Would we want to use the MCF abbreviation at all?  Or just convert ACF
 -> ManifoldCF wherever it is found in documentation?

 Similarly, the handle acf in package and class names would need to
 be addressed:

 org.apache.acf.core.interfaces.ACFException -> ?
 org.apache.acf.core.system.ACF -> ?

 ...bearing in mind that you'd better choose a consistent treatment for
 uppercase ACF in both contexts.

 (FWIW, my initial thought is:

 org.apache.acf.core.interfaces.ACFException ->
 org.apache.mcf.core.interfaces.ManifoldCFException
 org.apache.acf.core.system.ACF -> org.apache.mcf.core.system.ManifoldCF)


 Thoughts?

 Karl

 --
 Grant Ingersoll
 http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 
 7-8







Re: Exploring ManifoldCF ramifications

2010-09-21 Thread Jack Krupansky
It sounds like the vote is moot pending resolution of the doc/code usage 
issue, but I'll go ahead and withdraw my vote at least temporarily pending 
further evolution of the issue.


Are we going to reopen name suggestions? A few I had:

Congo [Con for connectors]
RepoMan [Repo for Repositories]
ConMan? [Con for Content, but too shady]
Reconto [Re for Repository, Con for Connector or Cont for Content]
Reptile [Rep for Repositories, tile as an allusion to organizing or 
arranging things]
Ralph [In honor of Ralph Kramden, bus driver on The Honeymooners. ACF is a 
form of software bus]

Connie [Conn for Connectors]
Repositor [Abbreviation of Repository, tor of connector]

They may not be great, but maybe somebody can create derivative names to 
improve them. If there are trademark issues (e.g., Congo), maybe subtle 
variations (other than bolting on CF) could fix the problem.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 9:28 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications


Do you wish to change your vote, in that case?
Karl

On Tue, Sep 21, 2010 at 8:02 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:
I'd much prefer a simple, short, name. Using a descriptive phrase as a name
has these problems. Tacking on CF does indeed fix one problem, but at a
high cost.

That said, I am okay with a combo of a short name and a long name. So, if
the short name were Ralph, the long name would be Apache Ralph Connectors
Framework and we would speak of either the Apache Ralph Connectors
Framework or just Ralph. Class names would begin with the capitalized
short name, Ralph, and package and file names would use the lower-case,
ralph as in org.apache.ralph.core.interfaces.RalphException. And upon
graduation, the project would be housed at http://ralph.apache.org/.
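
To make that convention concrete, here is a small sketch that derives each form from a single short name. ProjectNames and its methods are hypothetical and exist only for illustration; the derived strings follow the Ralph example in the paragraph above.

```java
// Hypothetical helper: derives the long name, package, class, and site from one short name.
class ProjectNames {
    private final String shortName; // capitalized short name, e.g. Ralph

    ProjectNames(String shortName) {
        this.shortName = shortName;
    }

    // Formal long name used outside the project.
    String fullName() {
        return "Apache " + shortName + " Connectors Framework";
    }

    // Package and file names use the lower-case short name.
    String basePackage() {
        return "org.apache." + shortName.toLowerCase(java.util.Locale.ROOT);
    }

    // Class names begin with the capitalized short name.
    String exceptionClass() {
        return basePackage() + ".core.interfaces." + shortName + "Exception";
    }

    // Site location after graduation, following the pattern given in the message.
    String homepage() {
        return "http://" + shortName.toLowerCase(java.util.Locale.ROOT) + ".apache.org/";
    }
}
```

With a single-word short name every derived form is mechanical, which is the practical argument being made against a name that carries an acronym suffix.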

Now, I wasn't seriously considering Ralph as a name for LCF, but... it
works for me.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 3:16 AM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: Exploring ManifoldCF ramifications


Folks,

The ManifoldCF name possibility leads to some challenges as far as our
documentation is concerned.  I thought that it might be a good idea
during the vote to explore those to see what people thought.

Here are some examples of how Apache Connectors Framework might get
used in text:

Apache Connectors Framework is an interesting offering from Apache.
ACF links repositories with search indices.  That's what ACF does.
The Apache Connectors Framework is a framework for repository
connectors primarily.

The above is not technically proper.  So instead we might conceivably
have done this:

Apache Connectors Framework is an interesting offering from Apache.
Connectors Framework links repositories with search indices.  That's
what CF does.  The Connectors Framework is a framework for repository
connectors primarily.

What is the equivalent for Apache ManifoldCF?

Apache ManifoldCF is an interesting offering from Apache.  ManifoldCF
links repositories with search indices.  That's what MCF does.
ManifoldCF is a framework for repository connectors primarily.

Note that the difference is that we would never say, The Apache
ManifoldCF...  or The Apache Manifold Connectors Framework..., just
ManifoldCF

Would we want to use the MCF abbreviation at all?  Or just convert ACF
-> ManifoldCF wherever it is found in documentation?

Similarly, the handle acf in package and class names would need to
be addressed:

org.apache.acf.core.interfaces.ACFException -> ?
org.apache.acf.core.system.ACF -> ?

...bearing in mind that you'd better choose a consistent treatment for
uppercase ACF in both contexts.

(FWIW, my initial thought is:

org.apache.acf.core.interfaces.ACFException ->
org.apache.mcf.core.interfaces.ManifoldCFException
org.apache.acf.core.system.ACF -> org.apache.mcf.core.system.ManifoldCF)

Thoughts?

Karl





Re: Exploring ManifoldCF ramifications

2010-09-21 Thread Jack Krupansky
That sounds fine with me, assuming that by ACF you mean the full name Apache 
Connectors Framework, with ACF as the local informal name. In practice, 
maybe everybody would refer to it as ACF externally as well, as was the 
case with LCF.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 11:16 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications


Add: Maniplex

I think we've indeed reopened discussion on this issue.  I'll track
these names and once again hold a vote for the purposes of selection,
this time NOT including ACF.  Then we will have a final vote on
whether to replace ACF with the new name, whatever we come up with.
Agreed?

Karl

On Tue, Sep 21, 2010 at 11:13 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:

It sounds like the vote is moot pending resolution of the doc/code usage
issue, but I'll go ahead and withdraw my vote at least temporarily pending
further evolution of the issue.

Are we going to reopen name suggestions? A few I had:

Congo [Con for connectors]
RepoMan [Repo for Repositories]
ConMan? [Con for Content, but too shady]
Reconto [Re for Repository, Con for Connector or Cont for Content]
Reptile [Rep for Repositories, tile as an allusion to organizing or
arranging things]
Ralph [In honor of Ralph Kramden, bus driver on The Honeymooners. ACF is a
form of software bus]
Connie [Conn for Connectors]
Repositor [Abbreviation of Repository, tor of connector]

They may not be great, but maybe somebody can create derivative names to
improve them. If there are trademark issues (e.g., Congo), maybe subtle
variations (other than bolting on CF) could fix the problem.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 9:28 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Exploring ManifoldCF ramifications


Do you wish to change your vote, in that case?
Karl

On Tue, Sep 21, 2010 at 8:02 AM, Jack Krupansky
jack.krupan...@lucidimagination.com wrote:


I'd much prefer a simple, short, name. Using a descriptive phrase as a name
has these problems. Tacking on CF does indeed fix one problem, but at a
high cost.

That said, I am okay with a combo of a short name and a long name. So, if
the short name were Ralph, the long name would be Apache Ralph Connectors
Framework and we would speak of either the Apache Ralph Connectors
Framework or just Ralph. Class names would begin with the capitalized
short name, Ralph, and package and file names would use the lower-case,
ralph as in org.apache.ralph.core.interfaces.RalphException. And upon
graduation, the project would be housed at http://ralph.apache.org/.

Now, I wasn't seriously considering Ralph as a name for LCF, but... it
works for me.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 3:16 AM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: Exploring ManifoldCF ramifications


Folks,

The ManifoldCF name possibility leads to some challenges as far as our
documentation is concerned.  I thought that it might be a good idea
during the vote to explore those to see what people thought.

Here are some examples of how Apache Connectors Framework might get
used in text:

Apache Connectors Framework is an interesting offering from Apache.
ACF links repositories with search indices.  That's what ACF does.
The Apache Connectors Framework is a framework for repository
connectors primarily.

The above is not technically proper.  So instead we might conceivably
have done this:

Apache Connectors Framework is an interesting offering from Apache.
Connectors Framework links repositories with search indices.  That's
what CF does.  The Connectors Framework is a framework for repository
connectors primarily.

What is the equivalent for Apache ManifoldCF?

Apache ManifoldCF is an interesting offering from Apache.  ManifoldCF
links repositories with search indices.  That's what MCF does.
ManifoldCF is a framework for repository connectors primarily.

Note that the difference is that we would never say, The Apache
ManifoldCF...  or The Apache Manifold Connectors Framework..., just
ManifoldCF

Would we want to use the MCF abbreviation at all?  Or just convert ACF
-> ManifoldCF wherever it is found in documentation?

Similarly, the handle acf in package and class names would need to
be addressed:

org.apache.acf.core.interfaces.ACFException -> ?
org.apache.acf.core.system.ACF -> ?

...bearing in mind that you'd better choose a consistent treatment for
uppercase ACF in both contexts.

(FWIW, my initial thought is:

org.apache.acf.core.interfaces.ACFException ->
org.apache.mcf.core.interfaces.ManifoldCFException
org.apache.acf.core.system.ACF -> org.apache.mcf.core.system.ManifoldCF)


Thoughts?

Karl







Re: Soliciting more potential names for the project formerly known as LCF

2010-09-21 Thread Jack Krupansky

A few more from my noon walk:

Recon [Re from Repository and con from connectors]
Reconn
Conx [ala Connects] - treat these four as one for initial voting, then separate
Conex
Connx
Connex
Reconx [ala Repository and Connects] - ditto
Reconex
Reconnx
Reconnex
Contor [Con and tor from Connector]
Contour [Contor into a word to avoid misspelling]
Contango [Connectors are a bit of a dance; a term from commodities futures 
trading]


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, September 21, 2010 4:27 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: Soliciting more potential names for the project formerly known as 
LCF



We're going to have another round of name selection, based on feedback
from the board.  We've also pretty much decided that the last round
did not have sufficient breadth, so here's an open solicitation call
for potential names for the next round.

So hold on to your hats, and throw your suggestions into the ring.

One word of caution.  Before offering a name, please do your best to
make sure it meets the following criteria:

(1) It's a single word
(2) It's not a very commonly-used word, or a well-known place; ideally
it's not a word you would find in the OED
(3) It does not have negative connotations
(4) A quick google with it in the context of open source shows nothing
(5) No more than 4 syllables

Here's what we've got so far for this round:


Congo
Connie
Heterolink
Heterosource
Heteroweb
Manicon
Manifolio
Manilink
Maniplex
Manisource
Maniweb
Multicon
Ralph
Reconto
RepoMan
Repositor
Reptile

Some of these don't fulfill the criteria above (e.g. Congo and
Reptile), and may be stricken from the voting list as a result.  But
feel free to riff at will... ;-)

Karl 




Re: [VOTE] Pick either Apache Connectors Framework or Apache ManifoldCF

2010-09-20 Thread Jack Krupansky
We could put CF on the end of each of the original short names and re-vote 
them... but don't do that unless somebody seconds that approach.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Monday, September 20, 2010 2:33 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: [VOTE] Pick either Apache Connectors Framework or Apache ManifoldCF


Grant's feedback from the board is that ACF would likely not be
blocked, but is indeed too broad.

Based on the popularity of Apache Manifold the last time 'round, but
the concern that Manifold is indeed a registered trademark, Grant
proposes that we consider changing the name from Apache Connectors
Framework to Apache ManifoldCF.  I'd like a vote of those who think
this is a good idea.

Vote +1 if you want this project to become Apache ManifoldCF, or -1 if
you want to keep the current name Apache Connectors Framework.

Karl 




Re: [VOTE] Pick either Apache Connectors Framework or Apache ManifoldCF

2010-09-20 Thread Jack Krupansky

+1

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Monday, September 20, 2010 3:23 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE] Pick either Apache Connectors Framework or Apache 
ManifoldCF



Based on the level of controversy, and the necessity of settling this
promptly, once and for all, I vote +1.

Karl 




Re: [jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-13 Thread Jack Krupansky
I briefly reviewed the proposal wiki and it looks good enough to move 
forward. There may be revisions as we actually start using it, but this is 
definitely a big step in the right direction.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Monday, September 13, 2010 4:30 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [jira] Commented: (CONNECTORS-98) API should be pure RESTful 
with the API verb represented using the HTTP GET/PUT/POST/DELETE methods



I have updated the wiki proposal document.  I now have working code
consistent with that implementation, which I will check in as soon as you
confirm that you are happy with the design, and when I have tested it 
more.


Karl

On Sun, Sep 12, 2010 at 8:28 PM, Jack Krupansky (JIRA) 
j...@apache.org wrote:




   [
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12908581#action_12908581]

Jack Krupansky commented on CONNECTORS-98:
--

Just to confirm, as requested, that I am comfortable sticking with
connection name (and job name, etc.) in API paths as opposed to using a
more abstract id since we seem to have an encoding convention to deal with
slash so that an ACF object name can always be represented using a single
HTTP path segment. Names clearly feel more natural and will be easier to
use, both for app code using the ACF API and for CURL and other scripting
tools.




 API should be pure RESTful with the API verb represented using the HTTP
 GET/PUT/POST/DELETE methods
 -

 Key: CONNECTORS-98
 URL: https://issues.apache.org/jira/browse/CONNECTORS-98
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: API
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
 Fix For: LCF Release 0.5


 (This was originally a comment on CONNECTORS-56 dated 7/16/2010.)
 It has come to my attention that the API would be more pure RESTful if the
 API verb was represented using the HTTP GET/PUT/POST/DELETE methods and the
 input argument identifier represented in the context path.
 So, GET outputconnection/get {connection_name:_connection_name_} would
 be GET outputconnections/connection_name
 and GET outputconnection/delete {connection_name:_connection_name_}
 would be DELETE outputconnections/connection_name
 and GET outputconnection/list would be GET outputconnections
 and PUT outputconnection/save
 {outputconnection:_output_connection_object_} would be PUT
 outputconnections/connection_name
 {outputconnection:_output_connection_object_}
 What we have today is certainly workable, but just not as pure as some
 might desire. It would be better to take care of this before the initial
 release so that we never have to answer the question of why it wasn't done
 as a proper RESTful API.
 BTW, I did check to verify that an HttpServlet running under Jetty can
 process the DELETE and PUT methods (using the doDelete and doPut method
 overrides.)
 Also, POST should be usable as an alternative to PUT for API calls that
 have large volumes of data.
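
The mapping proposed in the issue can be sketched as a small lookup. This is only an illustration and not the ACF code: the class RestMapping and the method toRest are hypothetical names, and the connection name is assumed to be already encoded so that it fits in one path segment.

```java
import java.util.Locale;

/** Hypothetical helper: turns the old verb-in-path style into an HTTP method plus resource path. */
final class RestMapping {
    private static final String COLLECTION = "outputconnections";

    /** Returns "METHOD path" for an old-style verb; connectionName is ignored for "list". */
    static String toRest(String verb, String connectionName) {
        switch (verb.toLowerCase(Locale.ROOT)) {
            case "get":
                return "GET " + COLLECTION + "/" + connectionName;
            case "delete":
                return "DELETE " + COLLECTION + "/" + connectionName;
            case "list":
                return "GET " + COLLECTION;
            case "save":
                // The connection object itself would travel in the request body.
                return "PUT " + COLLECTION + "/" + connectionName;
            default:
                throw new IllegalArgumentException("unknown verb: " + verb);
        }
    }
}
```

For example, RestMapping.toRest("delete", "solr") yields "DELETE outputconnections/solr", matching the second row of the proposal.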

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.






Re: [RESULT][VOTE] Pick your preferred name

2010-09-13 Thread Jack Krupansky
+1 to both - review of name and address the NTLM issue since ACF is getting 
closer to where an actual 0.1 release could be considered.


-- Jack Krupansky

--
From: Grant Ingersoll grant.ingers...@gmail.com
Sent: Monday, September 13, 2010 1:35 PM
To: connectors-dev@incubator.apache.org
Subject: Re: [RESULT][VOTE] Pick your preferred name


ACF passed the Incubator vote.

My question to the community is do you want me to go to the Board and ask 
for advice on this since the Board ultimately approves any podling 
graduating?  One Director weighed in on the vote saying the Board wouldn't 
care, but in my view it was not an official opinion.


I was actually thinking about asking the board for two things:
1. View of the name
2. Whether they have guidance on our repeated request about NTLM and its 
inclusion in any ACF release.  I believe someone was slated to engage with 
us a few months back, but I don't believe anyone has reached out to us 
yet.


Thoughts?

-Grant

On Sep 7, 2010, at 4:54 AM, Karl Wright wrote:


Voting is now closed.

Final tally (which only counts Robert's first choice and not all three):

Apache Connectors Framework 15
Apache Manifold 11
Apache Yukon 9
Apache Macon 4
Apache ManifoldCF 3
Apache Omni 1
Apache Acromantula 1
Apache Lukon 1

Karl


--
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8



[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-13 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909036#action_12909036
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

Looks good. This meets my expectations. Any further tweaks that might 
arise would be distinct Jira issues.




[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-12 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12908581#action_12908581
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

Just to confirm, as requested, that I am comfortable sticking with connection 
name (and job name, etc.) in API paths as opposed to using a more abstract id 
since we seem to have an encoding convention to deal with slash so that an ACF 
object name can always be represented using a single HTTP path segment. Names 
clearly feel more natural and will be easier to use, both for app code using 
the ACF API and for CURL and other scripting tools.







[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-10 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12908148#action_12908148
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

I am still pondering this embedded slash issue and checking into some things 
related to it. Maybe Monday I'll have something more concrete to say.

For example, I want to make sure I understand the rules for what a path can 
have in it in a URI and whether simply placing a name at the tail of the path 
means it can have slashes or other reserved characters in it. My model is that 
a name should occupy only a single path component.





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907614#action_12907614
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

I have looked at the code a bit but not made any actual progress at a patch, so 
you can go ahead and take a crack at it. Yes, I'll do the transformation table. 
As far as updating the wiki, do I have privileges to do that?





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907712#action_12907712
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

Some RESTful resource doc:

http://en.wikipedia.org/wiki/Representational_State_Transfer

http://www.xfront.com/REST-Web-Services.html

http://www.oracle.com/technetwork/articles/javase/table3-138001.html

The idea of using a plural is that it is the name of the collection and the 
qualifier (name or argument object) provides the specificity.





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907735#action_12907735
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

I think status is probably technically okay since it is disambiguated by the 
number of path elements, but it could be moved to the end:

 GET outputconnections/connection_name/status ()

vs.

 GET outputconnections/status/connection_name ()

Same for execute/request:

GET outputconnections/connection_name/request/command (arguments)

vs.

GET outputconnections/request/connection_name/command (arguments)


That way the connection name is always in the same position.

So, I'd revise my counter-proposal that way.
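
A minimal sketch of that layout, with the connection name always in the second position and the number of path elements telling the cases apart. ApiPath and its fields are hypothetical names used for illustration, and each segment is assumed to contain no raw "/".

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical parser for the layout collection[/name[/action[/command]]]. */
final class ApiPath {
    final String collection; // e.g. "outputconnections"
    final String name;       // null when the path addresses the whole collection
    final String action;     // e.g. "status" or "request"; null for the object itself
    final String command;    // only meaningful after "request"; null otherwise

    private ApiPath(List<String> parts) {
        this.collection = parts.get(0);
        this.name = at(parts, 1);
        this.action = at(parts, 2);
        this.command = at(parts, 3);
    }

    /** Returns the element at index i, or null when the path is shorter than that. */
    private static String at(List<String> parts, int i) {
        return i < parts.size() ? parts.get(i) : null;
    }

    static ApiPath parse(String path) {
        List<String> parts = Arrays.asList(path.split("/"));
        if (parts.isEmpty() || parts.size() > 4 || parts.get(0).isEmpty()) {
            throw new IllegalArgumentException("unsupported path: " + path);
        }
        return new ApiPath(parts);
    }
}
```

With this layout, "outputconnections/solr/status" and "outputconnections/solr/request/ping" both expose the name through the same field, so callers never need to know which variant they are handling before reading it.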





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907736#action_12907736
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

re: We could not pass (arguments) except as part of the path.

Sure, we could go that route, and list the arguments as path elements, but I 
think a JSON object (array list of arguments) is acceptable.

So, I'd go with the latter (JSON.)





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907758#action_12907758
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

re:  the command cannot itself contain / characters, or it won't be 
uniquely parseable

Elsewhere I noted that URI-reserved characters need to be encoded with the % 
notation, so this is not a fatal problem.


  reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
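
As a rough illustration of % encoding a name so that it occupies one path segment, the sketch below leans on the standard java.net.URLEncoder and URLDecoder classes. SegmentCodec is a hypothetical helper, not part of ACF, and URLEncoder is really a form encoder, so the space handling is adjusted by hand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical helper for putting an object name into a single URI path segment. */
final class SegmentCodec {
    static String encode(String name) {
        try {
            // URLEncoder writes a space as '+', which is form syntax; a path segment wants %20.
            // A literal '+' in the name has already become %2B, so this replace only hits spaces.
            return URLEncoder.encode(name, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    static String decode(String segment) {
        try {
            // Protect any raw '+' first, because URLDecoder would turn it into a space.
            return URLDecoder.decode(segment.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }
}
```

A name such as "a/b" becomes "a%2Fb", so the slash no longer splits the path, and decode reverses the step.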





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-09-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907875#action_12907875
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

It makes sense that getPathInfo would have removed escapes from the URL. So, 
either we don't use % escaping or we bypass getPathInfo and decode manually.

Maybe we could use backslash for escaping. I'm not sure whether it needs to be 
% escaped as well.

This is only needed if the user has one of the reserved special characters in a 
name. It would be an issue if it was something that users commonly needed, but 
it seems like more of an edge case rather than a common case.

Encourage people to use alphanumeric, -, and _ for names and it won't be an 
issue for them.

And, the real point of the API is access from code. We can provide helper 
functions for working with names and building API paths.
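
The "bypass and decode manually" option can be sketched with plain JDK classes. The idea is to split the still-encoded path on "/" first and only then decode each piece. RawPathDemo is a hypothetical name; in a servlet the undecoded input would presumably come from HttpServletRequest.getRequestURI, which the servlet documentation describes as not decoded by the container.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits an encoded path before decoding, so %2F stays inside one segment. */
final class RawPathDemo {
    static List<String> segments(String rawPath) {
        List<String> out = new ArrayList<String>();
        for (String part : rawPath.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty piece in front of a leading slash
            }
            try {
                // Keep a raw '+' as '+', since URLDecoder would otherwise read it as a space.
                out.add(URLDecoder.decode(part.replace("+", "%2B"), "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException("UTF-8 should always be available", e);
            }
        }
        return out;
    }
}
```

The same effect is visible with java.net.URI: for "http://localhost/api/outputconnections/a%2Fb", getPath() returns the decoded "/api/outputconnections/a/b" (four segments once split), while getRawPath() keeps "a%2Fb", which segments() turns into three segments ending in the name "a/b".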






[jira] Commented: (CONNECTORS-104) Make it easier to limit a web crawl to a single site

2010-09-08 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12907201#action_12907201
 ] 

Jack Krupansky commented on CONNECTORS-104:
---

Simple works best. This enhancement is primarily for the simple use case where 
a novice user tries to do what they think is obvious (crawl the web pages at 
this URL), but without considering all of the potential nuances or how to 
fully specify the details of their goal.

One nuance is whether subdomains are considered part of the domain. I would say 
no if a subdomain was specified by the user and yes if no subdomain was 
specified.

Another nuance is whether a path is specified to select a subset of a domain. 
It would be nice to handle that and (optionally) limit the crawl to that path 
(or sub-paths below it). An example would be to crawl the news archive for a 
site.
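
One way to read the two nuances above, as a sketch only: CrawlScope and inScope are hypothetical names, not part of the Web connector, and "no subdomain specified" is approximated by "the seed host has exactly two labels", which is wrong for hosts such as example.co.uk.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical scope check: is a candidate URL within the site implied by a seed URL? */
final class CrawlScope {
    static boolean inScope(String seedUrl, String candidateUrl) {
        URI seed = URI.create(seedUrl);
        URI cand = URI.create(candidateUrl);
        String seedHost = host(seed);
        String candHost = host(cand);
        if (seedHost == null || candHost == null) {
            return false;
        }
        // Assumption: a two-label host such as example.com means no subdomain was specified,
        // so any subdomain of it is accepted; otherwise the host must match exactly.
        boolean bareDomain = seedHost.split("\\.").length == 2;
        boolean hostOk = candHost.equals(seedHost)
                || (bareDomain && candHost.endsWith("." + seedHost));
        if (!hostOk) {
            return false;
        }
        String seedPath = path(seed);
        String candPath = path(cand);
        if (seedPath.endsWith("/")) {
            return candPath.startsWith(seedPath);
        }
        // Match the seed path itself or anything below it, but not a sibling like /newsletter.
        return candPath.equals(seedPath) || candPath.startsWith(seedPath + "/");
    }

    private static String host(URI u) {
        String h = u.getHost();
        return h == null ? null : h.toLowerCase(Locale.ROOT);
    }

    private static String path(URI u) {
        String p = u.getPath();
        return (p == null || p.isEmpty()) ? "/" : p;
    }
}
```

With a seed of "http://example.com/news", a page at "http://example.com/news/2010/x.html" is in scope while "http://example.com/newsletter" is not, which is the news archive case mentioned above.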


 Make it easier to limit a web crawl to a single site
 

 Key: CONNECTORS-104
 URL: https://issues.apache.org/jira/browse/CONNECTORS-104
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Web connector
Reporter: Jack Krupansky
Priority: Minor

 Unless the user explicitly enters an include regex carefully, a web crawl can 
 quickly get out of control and start crawling the entire web when all the 
 user may really want is to crawl just a single web site or portion thereof. 
 So, it would be preferable if either by default or with a simple button the 
 crawl could be limited to the seed web site(s).




[jira] Created: (CONNECTORS-104) Make it easier to limit a web crawl to a single site

2010-09-07 Thread Jack Krupansky (JIRA)
Make it easier to limit a web crawl to a single site


 Key: CONNECTORS-104
 URL: https://issues.apache.org/jira/browse/CONNECTORS-104
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Web connector
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
Priority: Minor
 Fix For: LCF Release 0.5


Unless the user explicitly enters an include regex carefully, a web crawl can 
quickly get out of control and start crawling the entire web when all the user 
may really want is to crawl just a single web site or portion thereof. So, it 
would be preferable if either by default or with a simple button the crawl 
could be limited to the seed web site(s).
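
For illustration only, here is one way a "limit to seed sites" button could derive an include regex from the seeds. It is a Python sketch; the helper name include_regex_for_seeds is made up, and it assumes the generated pattern would be usable as an include regex (it sticks to syntax that Python and Java regexes share).

```python
import re
from urllib.parse import urlsplit


def include_regex_for_seeds(seed_urls):
    """Build one anchored pattern that matches only URLs on the seed hosts."""
    hosts = []
    for url in seed_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host and host not in hosts:
            hosts.append(host)
    if not hosts:
        raise ValueError("no usable seed hosts")
    # Escape each host so that "." matches a literal dot only.
    alternatives = "|".join(re.escape(h) for h in hosts)
    # Scheme, one of the seed hosts, optional port, then a path or end of URL.
    return r"^https?://(?:" + alternatives + r")(?::\d+)?(?:/|$)"
```

Anchoring at the start and requiring "/" or end-of-string after the host keeps look-alike hosts and URLs that merely mention a seed in their query string out of the crawl.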





Re: About name change -- Macon

2010-08-31 Thread Jack Krupansky
How about Macon... from Mac[hinery] + con[nection]. A small city, also a 
dirigible airship.


-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Tuesday, August 31, 2010 6:46 AM
To: connectors-dev@incubator.apache.org
Subject: Re: About name change

Apache Manifold is growing on me.  And/or Apache Manifold CF or Apache 
Manifold Conn. Framework.


Has a nice short name, easy to pronounce, doesn't require funky acronyms 
and from Webster's:
Machinery . a chamber having several outlets through which a liquid or 
gas is distributed or gathered.  --  
http://dictionary.reference.com/browse/Manifold


Paraphrased, it's a chamber having several outlets through which bits 
are gathered and distributed.


-Grant


On Aug 30, 2010, at 5:57 PM, Mark Miller wrote:


On 8/30/10 5:20 PM, Karl Wright wrote:

I'm not going to go head-to-head with you trying to split hairs. ;-)
Can we agree that something like ContentCF is a possibility under your
guidelines?  (I'm not proposing that, I'm just trying to open the field up a
bit.)

Karl



From my end, most of that was off topic haggling - I'm not saying it
should be one way or the other per se. I personally see the benefit of
having a good unique word in the name of the project - and of trying to
follow the guidelines / feel of previous projects. I'd be perfectly fine
with something like Apache Manifold Connector Framework. But if push comes
to shove I wouldn't even vote against keeping things as is with the
Apache Connector Framework.

- Mark


--
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8



Re: About name change -- Acromantula

2010-08-31 Thread Jack Krupansky
Brand names are better when they are less purely descriptive or simply 
indirectly descriptive.


I'm okay with Acromantula if people want it, especially since it seems to 
adhere to all the Apache guidelines, but since I am unable to pronounce it, 
I would not be able to promote it via word of mouth!


Or, might J. K. Rowling sue us for stealing her work?

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, August 31, 2010 8:01 AM
To: connectors-dev@incubator.apache.org
Subject: Re: About name change -- Macon


I don't find any obvious software uses of the name.  I don't find it
terribly descriptive though - multiplex/manifold wins in my opinion.
Acromantula is also available and is more descriptive:

http://harrypotter.wikia.com/wiki/Acromantula

(if you don't mind the HP references. ;-) )

Karl



On Tue, Aug 31, 2010 at 7:52 AM, Jack Krupansky 
jack.krupan...@lucidimagination.com wrote:


How about Macon... from Mac[hinery] + con[nection]. A small city, also a
dirigible airship.

-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Tuesday, August 31, 2010 6:46 AM
To: connectors-dev@incubator.apache.org
Subject: Re: About name change

 Apache Manifold is growing on me.  And/or Apache Manifold CF or Apache

Manifold Conn. Framework.

Has a nice short name, easy to pronounce, doesn't require funky acronyms
and from Webster's:
Machinery . a chamber having several outlets through which a liquid or
gas is distributed or gathered.  --
http://dictionary.reference.com/browse/Manifold

Paraphrased, it's a chamber having several outlets through which bits
are gathered and distributed.

-Grant


On Aug 30, 2010, at 5:57 PM, Mark Miller wrote:

 On 8/30/10 5:20 PM, Karl Wright wrote:



I'm not going to go head-to-head with you trying to split hairs. ;-)
Can we agree that something like ContentCF is a possibility under your
guidelines?  (I'm not proposing that, I'm just trying to open the field
up a bit.)

Karl



From my end, most of that was off topic haggling - I'm not saying it
should be one way or the other per se. I personally see the benefit of
having a good unique word in the name of the project - and of trying to
follow the guidelines / feel of previous projects. I'd be perfectly fine
with something like Apache Manifold Connector Framework. But if push comes
to shove I wouldn't even vote against keeping things as is with the
Apache Connector Framework.

- Mark



--
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8







[jira] Commented: (CONNECTORS-57) Solr output connector option to commit at end of job, by default

2010-08-31 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904746#action_12904746
 ] 

Jack Krupansky commented on CONNECTORS-57:
--

This looks fine so far and should work for me.

If I understand the code, the Connector.noteJobComplete method is called when 
the job completes or is aborted, and the SolrConnector.noteJobComplete 
implementation unconditionally does a commit. That's fine for my use 
case, but we probably still want a connection option to disable that commit if 
the user has some other commit strategy in mind.
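
To make the suggested option concrete, here is a minimal sketch in Python. The class SolrOutputSketch, the commit_at_end_of_job flag, and the send_commit callable are all hypothetical stand-ins; this is not the actual SolrConnector code, only the shape of the behavior I have in mind.

```python
class SolrOutputSketch:
    """Hypothetical stand-in for an output connector with an end-of-job commit option."""

    def __init__(self, send_commit, commit_at_end_of_job=True):
        # send_commit: a callable supplied by the caller that issues the commit.
        self._send_commit = send_commit
        # Defaults to on, so the simple case needs no configuration.
        self._commit_at_end_of_job = commit_at_end_of_job

    def note_job_complete(self):
        """Called when a job completes or is aborted; returns True if a commit was sent."""
        if self._commit_at_end_of_job:
            self._send_commit()
            return True
        return False
```

With the flag turned off, note_job_complete does nothing, which leaves room for a user who commits on some other schedule.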

 Solr output connector option to commit at end of job, by default
 

 Key: CONNECTORS-57
 URL: https://issues.apache.org/jira/browse/CONNECTORS-57
 Project: Apache Connectors Framework
  Issue Type: Sub-task
  Components: Lucene/SOLR connector
Reporter: Jack Krupansky

 By default, Solr will eventually commit documents that have been submitted to 
 the Solr Cell interface, but the time lag can confuse and annoy people. 
 Although commit strategy is a difficult issue in general, an option in LCF to 
 automatically commit at the end of a job, by default, would eliminate a lot 
 of potential confusion and generally be close to what the user needs.
 The desired feature is that there be an option to commit for each job that 
 uses the Solr output connector. This option would default to on (or a 
 different setting based on some global configuration setting), but the user 
 may turn it off if commit is only desired upon completion of some jobs.




Re: [VOTE] Pick your preferred name

2010-08-31 Thread Jack Krupansky

1. Apache Yukon
2. Apache Macon
3. Apache Lukon

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Tuesday, August 31, 2010 6:44 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: [VOTE] Pick your preferred name

I know this is un-Apache-like, but please respond to the following list with
a selection, in order, of the top three names for the project currently
known as Apache Connectors Framework.  The choices
are:

Apache Connectors Framework
Apache Acromantula
Apache Manifold
Apache ManifoldCF
Apache Multiplex
Apache Lucon
Apache Lukon
Apache Yukon
Apache Macon
Apache Omni
Apache Omnivore
Apache CMCF (yes, I just invented that one ;-) )
Apache Multivore (yes, I just invented that one too. ;-) )

I don't think I missed any?  If I did, chastise me severely please. ;-)

Karl



Re: About name change

2010-08-30 Thread Jack Krupansky
I think the first order of business should be to decide whether the name is 
going to be descriptive or abstract. Exactly what that abstract name or 
descriptive name is should be the second order of business, I think. Some 
might disagree, but I don't think the first decision should be predicated on 
the exact list of name choices for the second decision.


Should there be a vote on whether to vote for abstract vs. descriptive or 
just proceed to vote directly?


-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Monday, August 30, 2010 12:50 PM
To: connectors-dev@incubator.apache.org
Subject: Re: About name change

So, there were some other suggestions on the Incubator list.  What do 
people think of the Open Connector Framework?  OCF?  (Granted, it is silly 
to me given it will be the Apache Open Conn. Framework, which still 
implies it is the Apache one.)


Any other suggestions?


On Aug 26, 2010, at 9:04 AM, Jack Krupansky wrote:

Personally, I'd rather see a traditional, Apache-style name, but I can 
certainly live with whatever the PMC (?) endorses.


I agree with the general@ criticism that the ACF name comes across as 
being the ultimate end-all connector framework for Apache land (land 
grab). We should acknowledge that in the future there might be other 
projects that seek to offer connector frameworks in Apache land. There 
really should be a handle to qualify the purely descriptive portion of 
the name - and we had one: Lucene, but it wasn't unique and even there 
did not acknowledge that in the future there could be other connector 
frameworks.


Note: We effectively have a handle name today: LCF or ACF, but it is a 
distinctly non-Apache style of name. Why not go with an Apache-style 
name. That said, I do see that there are a minority of Apache Projects 
that have descriptive names, including HttpComponents, OpenWebBeans, 
TrafficServer, Web Services, XML Graphics. Well, there is also HTTP 
Server as well, but that is an anomaly since it is really just the 
original Apache itself. Maybe the question is what the current consensus 
preference is in Apache land and trying to go with the flow rather than 
try to go against the flow.


In short, even if Connectors Framework remains the tail end of the 
name, a handle prefix is needed. Apache is the general prefix for ALL 
Apache projects and not a handle for any of them. If that handle is 
Connecto, the full name could be Connecto Connectors Framework, and 
the official project name would be Apache Connecto Connectors 
Framework. That said, I am not a fan of trying to put the project 
description into the name in raw English form. So, my preference there 
would be to drop Connectors Framework from the name and stick with 
Connecto, or whatever other handle is chosen.


As I said, I will defer to whatever the PMC (?) endorses, but I would hope that 
there is some consistency with current and traditional Apache project 
naming conventions.


-- Jack Krupansky

--
From: Simon Willnauer simon.willna...@googlemail.com
Sent: Thursday, August 26, 2010 7:50 AM
To: Grant Ingersoll gsing...@apache.org
Cc: connectors-dev@incubator.apache.org
Subject: Re: About name change

On Thu, Aug 26, 2010 at 12:42 PM, Grant Ingersoll gsing...@apache.org 
wrote:


On Aug 26, 2010, at 6:14 AM, Karl Wright wrote:


Is it clear that ACF is dead?  The concern raised was that it implied
something that connected lots of stuff together, and that's not what it
was.  But I think that that IS what it is, so the poster knew little or
nothing about the project, and was operating from ignorance.  Does it make
sense to clarify what ACF does to the general list first?


I think it is worthwhile.  You want to take a crack at it?

Absolutely +1 - I just have the impression that people are already
biased by Tomcat Connector etc. but I will be a supporter of Apache
Connector FW, no doubt. If it is not an option we can still discuss
here!

simon




Karl

On Thu, Aug 26, 2010 at 5:26 AM, Simon Willnauer 
simon.willna...@googlemail.com wrote:


Hey folks,

I was following the discussion about changing the name to Apache
Connector Framework and the late response from people on general@.
Obviously we need to decide on something other than Apache Connectors
Framework since many people had concerns about the name and possible
confusion. I have the impression we should first collect some
suggestions about alternative names here before we continue discussion
on general@. Once we have a name we all agree on and that doesn't
raise the concerns others had, we should go back and discuss
further.
Some folks suggested a more abstract name like Apache Connecto, which I
personally like (not necessarily Connecto, but a more abstract name).
Such names have many advantages, as people remember short names and
they are less ambiguous.

Any suggestions, thoughts?

simon

Re: About name change

2010-08-30 Thread Jack Krupansky
I meant decide the abstract vs. descriptive issue first. Whether we need to 
decide to vote whether to hold a vote on that or just vote immediately on 
the abstract vs. descriptive question. Either way is fine with me. I'd 
prefer to hold off on deciding the exact name until the abstract vs. 
descriptive issue is resolved.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Monday, August 30, 2010 3:20 PM
To: connectors-dev@incubator.apache.org
Subject: Re: About name change

I think we should vote directly.  Perhaps we can save time by supplying our
top three choices, in order.

Karl


On Mon, Aug 30, 2010 at 2:12 PM, Jack Krupansky 
jack.krupan...@lucidimagination.com wrote:

I think the first order of business should be to decide whether the name is
going to be descriptive or abstract. Exactly what that abstract name or
descriptive name is should be the second order of business, I think. Some
might disagree, but I don't think the first decision should be predicated on
the exact list of name choices for the second decision.

Should there be a vote on whether to vote for abstract vs. descriptive or
just proceed to vote directly?

-- Jack Krupansky

--
From: Grant Ingersoll gsing...@apache.org
Sent: Monday, August 30, 2010 12:50 PM
To: connectors-dev@incubator.apache.org

Subject: Re: About name change

So, there were some other suggestions on the Incubator list.  What do
people think of the Open Connector Framework?  OCF?  (Granted, it is silly
to me given it will be the Apache Open Conn. Framework, which still implies
it is the Apache one.)

Any other suggestions?


On Aug 26, 2010, at 9:04 AM, Jack Krupansky wrote:

Personally, I'd rather see a traditional, Apache-style name, but I can
certainly live with whatever the PMC (?) endorses.

I agree with the general@ criticism that the ACF name comes across as
being the ultimate end-all connector framework for Apache land (land
grab). We should acknowledge that in the future there might be other
projects that seek to offer connector frameworks in Apache land. There
really should be a handle to qualify the purely descriptive portion of the
name - and we had one: Lucene, but it wasn't unique and even there did not
acknowledge that in the future there could be other connector frameworks.

Note: We effectively have a handle name today: LCF or ACF, but it is a
distinctly non-Apache style of name. Why not go with an Apache-style name?
That said, I do see that there is a minority of Apache Projects that have
descriptive names, including HttpComponents, OpenWebBeans, TrafficServer,
Web Services, XML Graphics. Well, there is also HTTP Server, but
that is an anomaly since it is really just the original Apache itself. Maybe
the question is what the current consensus preference is in Apache land and
trying to go with the flow rather than against it.

In short, even if Connectors Framework remains the tail end of the
name, a handle prefix is needed. Apache is the general prefix for ALL
Apache projects and not a handle for any of them. If that handle is
Connecto, the full name could be Connecto Connectors Framework, and the
official project name would be Apache Connecto Connectors Framework. That
said, I am not a fan of trying to put the project description into the name
in raw English form. So, my preference there would be to drop Connectors
Framework from the name and stick with Connecto, or whatever other
handle is chosen.

As I said, I will defer to whatever the PMC (?) endorses, but I would hope that
there is some consistency with current and traditional Apache project naming
conventions.

-- Jack Krupansky

--
From: Simon Willnauer simon.willna...@googlemail.com
Sent: Thursday, August 26, 2010 7:50 AM
To: Grant Ingersoll gsing...@apache.org
Cc: connectors-dev@incubator.apache.org
Subject: Re: About name change

On Thu, Aug 26, 2010 at 12:42 PM, Grant Ingersoll gsing...@apache.org wrote:



On Aug 26, 2010, at 6:14 AM, Karl Wright wrote:

Is it clear that ACF is dead?  The concern raised was that it implied
something that connected lots of stuff together, and that's not what it
was.  But I think that that IS what it is, so the poster knew little or
nothing about the project, and was operating from ignorance.  Does it make
sense to clarify what ACF does to the general list first?



I think it is worthwhile.  You want to take a crack at it?


Absolutely +1 - I just have the impression that people are already
biased by Tomcat Connector etc. but I will be a supporter of Apache
Connector FW, no doubt. If it is not an option we can still discuss
here!

simon





Karl

On Thu, Aug 26, 2010 at 5:26 AM, Simon Willnauer 
simon.willna...@googlemail.com wrote:

 Hey folks,


I was following the discussion about changing the name to Apache
Connector Framework

Re: About name change

2010-08-30 Thread Jack Krupansky
I suspect those multi-word names kind of sneaked in without the naming 
police having a chance to point out the naming guidelines early in the 
project process.


For the record, I am okay with XYZ Open Connectors Framework or XYZ Content 
Connectors Framework or XYZ Connectors Framework as the full name, with XYZ 
as the official Apache name (or handle as I call it), where XYZ is a 
placeholder for a name as yet to be determined. And Apache gets stuck on 
the front of the name, by convention.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Monday, August 30, 2010 4:50 PM
To: connectors-dev@incubator.apache.org
Subject: Re: About name change

TrafficServer?  OpenWebBeans? XMLBeans?  There are actually a *lot* of names
that are multiple words.  They're just mashed together. ;-)

Karl

On Mon, Aug 30, 2010 at 4:44 PM, Mark Miller markrmil...@gmail.com 
wrote:



On 8/30/10 1:37 PM, Karl Wright wrote:
 snip - Consider using functional names, especially for products of existing
 projects, e.g. for an Apache Foo project, the product name Apache Foo
 Pipelines. -snip

 Granted, Lucene Connectors Framework fills this to a T, but this would
 imply that functional names are OK for top-level projects too.

FYI, these are listed as guidelines, so I don't think they are meant to
determine what is OK or not. A guideline is by definition not mandatory.

It would seem to me that the reason this is emphasized for subprojects
of foo even more so than foo, is that foo will already be a unique
simple abstract name. After you have that, it's best to be descriptive
for sub projects. If you don't have a unique simple abstract 'component'
of the name for a top level project, many of the other guidelines are
not met very well.

Below are some current Apache project names - you start to see a pattern
- notice that most of them will be the top hit on google using simply
the name (yes, including ant, tiles and felix surprisingly ;) ). This
isn't always the case of course - many different historical issues
factor into these names - but as you can see - even just more than one
word for the name is extremely uncommon.

HTTP Server
Abdera
ActiveMQ
Ant
APR
Archiva
Avro
Buildr
Camel
Cassandra
Cayenne
Click
Cocoon
Commons
Continuum
CouchDB
CXF
DB
Directory
Excalibur
Felix
Forrest
Geronimo
Gump
Hadoop
Harmony
HBase
HttpComponents
Jackrabbit
Jakarta
James
Lenya
Logging
Lucene
Mahout
Maven
Mina
MyFaces
Nutch
ODE
OFBiz
OpenEJB
OpenJPA
OpenWebBeans
PDFBox
Perl
Pivot
POI
Portals
Qpid
Roller
Santuario
ServiceMix
Shindig
Sling
SpamAssassin
STDCXX
Struts
Subversion
Synapse
Tapestry
Tika
TCL
Tiles
Tomcat
TrafficServer
Turbine
Tuscany
UIMA
Velocity
Wicket
Web Services
Xalan
Xerces
XML
XMLBeans
XML Graphics


 Karl

 On Mon, Aug 30, 2010 at 1:24 PM, Mark Miller markrmil...@gmail.com
wrote:

 On 8/30/10 1:05 PM, Karl Wright wrote:

 I'm not too keen on just a simple abstract name - too meaningless for me.

 It works for countless Apache projects (that's really the standard) -
 not really buying it would be a problem here.

 Also, I haven't been following closely, so if someone hasn't pointed it
 out yet, fyi on some recommendations:
 http://www.apache.org/dev/project-names.html

 - Mark









[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-08-27 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12903559#action_12903559
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

I'll be mostly looking through code and thinking it through and looking at the 
API string changes first, so I may not touch any code for another week, if not 
longer. Feel free to rename or refactor code at will. I'll probably let you 
know in advance of what changes I expect to make in the code.

 API should be pure RESTful with the API verb represented using the HTTP 
 GET/PUT/POST/DELETE methods
 -

 Key: CONNECTORS-98
 URL: https://issues.apache.org/jira/browse/CONNECTORS-98
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: API
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
 Fix For: LCF Release 0.5


 (This was originally a comment on CONNECTORS-56 dated 7/16/2010.)
 It has come to my attention that the API would be more pure RESTful if the 
 API verb was represented using the HTTP GET/PUT/POST/DELETE methods and the 
 input argument identifier represented in the context path.
 So, GET outputconnection/get {connection_name:_connection_name_} would 
 be GET outputconnections/connection_name
 and GET outputconnection/delete {connection_name:_connection_name_} 
 would be DELETE outputconnections/connection_name
 and GET outputconnection/list would be GET outputconnections
 and PUT outputconnection/save {outputconnection:_output_connection_object_} 
 would be PUT outputconnections/connection_name 
 {outputconnection:_output_connection_object_}
 What we have today is certainly workable, but just not as pure as some 
 might desire. It would be better to take care of this before the initial 
 release so that we never have to answer the question of why it wasn't done as 
 a proper RESTful API.
 BTW, I did check to verify that an HttpServlet running under Jetty can 
 process the DELETE and PUT methods (using the doDelete and doPut method 
 overrides.)
 Also, POST should be usable as an alternative to PUT for API calls that have 
 large volumes of data.




Re: About name change

2010-08-26 Thread Jack Krupansky
Personally, I'd rather see a traditional, Apache-style name, but I can 
certainly live with whatever the PMC (?) endorses.


I agree with the general@ criticism that the ACF name comes across as being 
the ultimate end-all connector framework for Apache land (land grab). We 
should acknowledge that in the future there might be other projects that 
seek to offer connector frameworks in Apache land. There really should be 
a handle to qualify the purely descriptive portion of the name - and we 
had one: Lucene, but it wasn't unique and even there did not acknowledge 
that in the future there could be other connector frameworks.


Note: We effectively have a handle name today: LCF or ACF, but it is a 
distinctly non-Apache style of name. Why not go with an Apache-style name? 
That said, I do see that there is a minority of Apache Projects that have 
descriptive names, including HttpComponents, OpenWebBeans, TrafficServer, 
Web Services, XML Graphics. Well, there is also HTTP Server, but 
that is an anomaly since it is really just the original Apache itself. Maybe 
the question is what the current consensus preference is in Apache land and 
trying to go with the flow rather than against it.


In short, even if Connectors Framework remains the tail end of the name, a 
handle prefix is needed. Apache is the general prefix for ALL Apache 
projects and not a handle for any of them. If that handle is Connecto, the 
full name could be Connecto Connectors Framework, and the official project 
name would be Apache Connecto Connectors Framework. That said, I am not a 
fan of trying to put the project description into the name in raw English 
form. So, my preference there would be to drop Connectors Framework from 
the name and stick with Connecto, or whatever other handle is chosen.


As I said, I will defer to whatever the PMC (?) endorses, but I would hope that there 
is some consistency with current and traditional Apache project naming 
conventions.


-- Jack Krupansky

--
From: Simon Willnauer simon.willna...@googlemail.com
Sent: Thursday, August 26, 2010 7:50 AM
To: Grant Ingersoll gsing...@apache.org
Cc: connectors-dev@incubator.apache.org
Subject: Re: About name change

On Thu, Aug 26, 2010 at 12:42 PM, Grant Ingersoll gsing...@apache.org 
wrote:


On Aug 26, 2010, at 6:14 AM, Karl Wright wrote:


Is it clear that ACF is dead?  The concern raised was that it implied
something that connected lots of stuff together, and that's not what it
was.  But I think that that IS what it is, so the poster knew little or
nothing about the project, and was operating from ignorance.  Does it 
make

sense to clarify what ACF does to the general list first?


I think it is worthwhile.  You want to take a crack at it?

Absolutely +1 - I just have the impression that people are already
biased by Tomcat Connector etc. but I will be a supporter of Apache
Connector FW, no doubt. If it is not an option we can still discuss
here!

simon




Karl

On Thu, Aug 26, 2010 at 5:26 AM, Simon Willnauer 
simon.willna...@googlemail.com wrote:


Hey folks,

I was following the discussion about changing the name to Apache
Connector Framework and the late response from people on general@.
Obviously we need to decide on something other than Apache Connectors
Framework since many people had concerns about the name and possible
confusion. I have the impression we should first collect some
suggestions about alternative names here before we continue discussion
on general@. Once we have a name we all agree on and that doesn't
raise the concerns others had, we should go back and discuss
further.
Some folks suggested a more abstract name like Apache Connecto, which I
personally like (not necessarily Connecto, but a more abstract name).
Such names have many advantages, as people remember short names and
they are less ambiguous.

Any suggestions, thoughts?

simon



--
Grant Ingersoll
http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8




[jira] Created: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-08-26 Thread Jack Krupansky (JIRA)
API should be pure RESTful with the API verb represented using the HTTP 
GET/PUT/POST/DELETE methods
-

 Key: CONNECTORS-98
 URL: https://issues.apache.org/jira/browse/CONNECTORS-98
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: API
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
 Fix For: LCF Release 0.5


(This was originally a comment on CONNECTORS-56 dated 7/16/2010.)

It has come to my attention that the API would be more pure RESTful if the 
API verb was represented using the HTTP GET/PUT/POST/DELETE methods and the 
input argument identifier represented in the context path.

So, GET outputconnection/get {connection_name:_connection_name_} would 
be GET outputconnections/connection_name

and GET outputconnection/delete {connection_name:_connection_name_} would 
be DELETE outputconnections/connection_name

and GET outputconnection/list would be GET outputconnections

and PUT outputconnection/save {outputconnection:_output_connection_object_} 
would be PUT outputconnections/connection_name 
{outputconnection:_output_connection_object_}

What we have today is certainly workable, but just not as pure as some might 
desire. It would be better to take care of this before the initial release so 
that we never have to answer the question of why it wasn't done as a proper 
RESTful API.

BTW, I did check to verify that an HttpServlet running under Jetty can process 
the DELETE and PUT methods (using the doDelete and doPut method overrides.)

Also, POST should be usable as an alternative to PUT for API calls that have 
large volumes of data.
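
As a sanity check on the mapping described above, here is a small Python sketch that translates the current verb-in-path calls into the proposed method-plus-resource form. The function name to_restful and the simple "append s" pluralization are assumptions for illustration; only the outputconnection examples from this issue are covered.

```python
from urllib.parse import quote

# Assumed mapping from the verb in the current path to the HTTP method.
VERB_TO_METHOD = {"get": "GET", "delete": "DELETE", "save": "PUT"}


def to_restful(path, name=None):
    """Translate e.g. 'outputconnection/get' plus a name into (method, resource path)."""
    resource, _, verb = path.strip("/").partition("/")
    collection = resource + "s"  # outputconnection -> outputconnections
    if verb == "list":
        # Listing is a GET on the collection itself.
        return ("GET", collection)
    if verb in VERB_TO_METHOD:
        if not name:
            raise ValueError("verb %r needs a connection name" % verb)
        # The input argument identifier moves into the path, URL-encoded.
        return (VERB_TO_METHOD[verb], collection + "/" + quote(name, safe=""))
    raise ValueError("unknown verb: %r" % verb)
```

For example, to_restful("outputconnection/delete", "solr") gives ("DELETE", "outputconnections/solr"). The connection object for a save would still travel in the request body.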





[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-08-26 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902982#action_12902982
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

Karl asks, "What do you plan to do for the list and execute verbs?"

List would be a GET and execute would be PUT.
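
To show what dispatching on the HTTP method might look like, here is a toy in-memory sketch in Python. The class OutputConnectionsResource and its status codes are my own assumptions, and the execute verb is not modeled here; in the servlet this would correspond to the doGet, doPut, and doDelete overrides.

```python
class OutputConnectionsResource:
    """Toy in-memory resource that dispatches on the HTTP method."""

    def __init__(self):
        self._connections = {}

    def handle(self, method, path, body=None):
        """Return a (status, payload) pair for the given method and path."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "outputconnections" or len(parts) > 2:
            return (404, None)
        name = parts[1] if len(parts) == 2 else None

        if method == "GET":
            if name is None:
                # "list" is simply a GET on the collection.
                return (200, sorted(self._connections))
            if name in self._connections:
                return (200, self._connections[name])
            return (404, None)
        if method == "PUT" and name is not None:
            created = name not in self._connections
            self._connections[name] = body
            return (201 if created else 200, None)
        if method == "DELETE" and name is not None:
            if name in self._connections:
                del self._connections[name]
                return (200, None)
            return (404, None)
        # Any other method/path combination is not allowed in this sketch.
        return (405, None)
```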


 API should be pure RESTful with the API verb represented using the HTTP 
 GET/PUT/POST/DELETE methods
 -

 Key: CONNECTORS-98
 URL: https://issues.apache.org/jira/browse/CONNECTORS-98
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: API
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
 Fix For: LCF Release 0.5


 (This was originally a comment on CONNECTORS-56 dated 7/16/2010.)
 It has come to my attention that the API would be more pure RESTful if the 
 API verb was represented using the HTTP GET/PUT/POST/DELETE methods and the 
 input argument identifier represented in the context path.
 So, GET outputconnection/get {connection_name:_connection_name_} would 
 be GET outputconnections/connection_name
 and GET outputconnection/delete {connection_name:_connection_name_} 
 would be DELETE outputconnections/connection_name
 and GET outputconnection/list would be GET outputconnections
 and PUT outputconnection/save {outputconnection:_output_connection_object_} 
 would be PUT outputconnections/connection_name 
 {outputconnection:_output_connection_object_}
 What we have today is certainly workable, but just not as pure as some 
 might desire. It would be better to take care of this before the initial 
 release so that we never have to answer the question of why it wasn't done as 
 a proper RESTful API.
 BTW, I did check to verify that an HttpServlet running under Jetty can 
 process the DELETE and PUT methods (using the doDelete and doPut method 
 overrides.)
 Also, POST should be usable as an alternative to PUT for API calls that have 
 large volumes of data.
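
 The mapping proposed in this issue can be sketched in a few lines of Java.
 This is only an illustration under stated assumptions: the class
 RestStyleMapper and its method toRestStyle are hypothetical names that do
 not exist in LCF, and the pluralization rule (append "s") is inferred from
 the single outputconnection example given above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of LCF: rewrites the verb-in-path calls
// quoted in the issue into the proposed "HTTP method + resource" form.
final class RestStyleMapper {
    // Old path verb -> HTTP method, as proposed in the issue text.
    private static final Map<String, String> METHOD_FOR_VERB =
        new LinkedHashMap<String, String>();
    static {
        METHOD_FOR_VERB.put("get", "GET");
        METHOD_FOR_VERB.put("delete", "DELETE");
        METHOD_FOR_VERB.put("list", "GET");
        METHOD_FOR_VERB.put("save", "PUT");
    }

    // oldPath looks like "outputconnection/get"; name is the connection name
    // and is ignored for "list", which addresses the whole collection.
    static String toRestStyle(String oldPath, String name) {
        int slash = oldPath.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("expected resource/verb: " + oldPath);
        }
        String resource = oldPath.substring(0, slash);
        String verb = oldPath.substring(slash + 1);
        String method = METHOD_FOR_VERB.get(verb);
        if (method == null) {
            throw new IllegalArgumentException("unknown verb: " + verb);
        }
        // Assumed rule: outputconnection -> outputconnections.
        String collection = resource + "s";
        if ("list".equals(verb)) {
            return method + " " + collection;
        }
        if (name == null) {
            throw new IllegalArgumentException("verb " + verb + " needs a name");
        }
        return method + " " + collection + "/" + name;
    }

    public static void main(String[] args) {
        System.out.println(toRestStyle("outputconnection/get", "connection_name"));
        System.out.println(toRestStyle("outputconnection/delete", "connection_name"));
        System.out.println(toRestStyle("outputconnection/list", null));
        System.out.println(toRestStyle("outputconnection/save", "connection_name"));
    }
}
```

 The point of the sketch is that the verb moves out of the path and into the
 HTTP method, while the identifier moves out of the JSON argument and into
 the path. The JSON body for save is unchanged and is not modeled here.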




[jira] Commented: (CONNECTORS-98) API should be pure RESTful with the API verb represented using the HTTP GET/PUT/POST/DELETE methods

2010-08-26 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12902983#action_12902983
 ] 

Jack Krupansky commented on CONNECTORS-98:
--

Karl says "I await your patch."

Point well made. There is a great starting point with the current code. A bit 
of refactoring required.





Re: [VOTE]: Change svn root for Apache Connectors Framework?

2010-08-25 Thread Jack Krupansky
I am in favor of the change assuming the name change is officially official 
despite the chatter on gene...@incubator.apache.org, but I'd like to see 
some confirmation that the grumbling has subsided over there.


-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, August 25, 2010 8:23 AM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: [VOTE]: Change svn root for Apache Connectors Framework?


Vote +1 to change the root of the svn repository for Apache Connectors
Framework from:

https://svn.apache.org/repos/asf/incubator/lcf

to:

https://svn.apache.org/repos/asf/incubator/acf

Vote will remain open until 5:00 PM Friday, August 27th, Boston time 
(EDT).


(I'm trying to do this right this time, so let me know if I still don't have
the process quite correct.)
Thanks,
Karl



Re: [VOTE]: Change svn root for Apache Connectors Framework?

2010-08-25 Thread Jack Krupansky
I am not subscribed to that list either, but I did a Google search and found 
your original message to that list here:


http://www.mail-archive.com/gene...@incubator.apache.org/msg25229.html

With links to responses at the bottom, for any of the rest of us who are not 
subscribers but want to read what the discussion was, or still is.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Wednesday, August 25, 2010 9:25 AM
To: connectors-dev@incubator.apache.org
Subject: RE: [VOTE]: Change svn root for Apache Connectors Framework?

I'm not subscribed to that list - I've been going on what Grant posted to 
connectors-dev about the decision being made.  If it's going to be undone 
I'd sure like to know.


Karl

-Original Message-
From: ext Jack Krupansky [mailto:jack.krupan...@lucidimagination.com]
Sent: Wednesday, August 25, 2010 9:17 AM
To: connectors-dev@incubator.apache.org
Subject: Re: [VOTE]: Change svn root for Apache Connectors Framework?

I am in favor of the change assuming the name change is officially official
despite the chatter on gene...@incubator.apache.org, but I'd like to see
some confirmation that the grumbling has subsided over there.

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Wednesday, August 25, 2010 8:23 AM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: [VOTE]: Change svn root for Apache Connectors Framework?


Vote +1 to change the root of the svn repository for Apache Connectors
Framework from:

https://svn.apache.org/repos/asf/incubator/lcf

to:

https://svn.apache.org/repos/asf/incubator/acf

Vote will remain open until 5:00 PM Friday, August 27th, Boston time
(EDT).

(I'm trying to do this right this time, so let me know if I still don't have
the process quite correct.)
Thanks,
Karl



Re: Need an opinion, on whether to change package or not

2010-08-22 Thread Jack Krupansky

+1

-- Jack Krupansky

--
From: Karl Wright daddy...@gmail.com
Sent: Sunday, August 22, 2010 1:49 PM
To: connectors-dev connectors-dev@incubator.apache.org
Subject: Need an opinion, on whether to change package or not


Consider this an official request for a vote.

+1 indicates you think we should change the following in the source code, as
soon as is practical:

org.apache.lcf.xxx -> org.apache.acf.xxx
All classes LCF.java and LCFException.java should change to ACF.java and
ACFException.java

Bear in mind that users of ACF/LCF who currently have existing database
instances will need to reinitialize those instances if we do this change.
This is because the class names of connectors are stored in the database
when the connector is registered.

(FWIW, my vote on this is -1.  It doesn't seem worth the disruption.  But I
will of course abide by the consensus.)

Vote will be considered closed by Wednesday evening, so vote early (and
often. ;-))
Karl



[jira] Commented: (CONNECTORS-56) All features should be accessible through an API

2010-07-14 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12888377#action_12888377
 ] 

Jack Krupansky commented on CONNECTORS-56:
--

Some cURL and/or Perl test scripts to illustrate use of the API would be 
helpful.

 All features should be accessible through an API
 

 Key: CONNECTORS-56
 URL: https://issues.apache.org/jira/browse/CONNECTORS-56
 Project: Lucene Connector Framework
  Issue Type: Sub-task
  Components: Framework core
Reporter: Jack Krupansky

 LCF consists of a full-featured crawling engine and a full-featured user 
 interface to access the features of that engine, but some applications are 
 better served with a full API that lets the application control the crawling 
 engine, including creation and editing of connections and creation, editing, 
 and control of jobs. Put simply, everything that a user can accomplish via 
 the LCF UI should be doable through an LCF API. All LCF objects should be 
 queryable through the API.
 A primary use case is Solr applications which currently use Aperture for 
 crawling, but would prefer the full-featured capabilities of LCF as a 
 crawling engine over Aperture.
 I do not wish to over-specify the API in this initial description, but I 
 think the LCF API should probably be a traditional REST API, with some of 
 the API elements specified via the context path, some parameters via URL 
 query parameters, and complex, detailed structures as JSON (or similar). The 
 precise details of the API are beyond the scope of this initial description 
 and will be added incrementally once the high-level approach to the API 
 becomes reasonably settled.
 A job status and event reporting scheme is also needed in conjunction with 
 the LCF API. That requirement has already been captured as CONNECTORS-41.
 The intention for the API is to create, edit, access, and control all of the 
 objects managed by LCF. The main focus is on repositories, jobs, and status, 
 and less about document-specific crawling information, but there may be some 
 benefit to querying crawling status for individual documents as well.
 Nothing in this proposal should in any way limit or constrain the features 
 that will be available in the LCF UI. The intent is that LCF should continue 
 to have a full-featured UI in addition to a full-featured API.
 Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-60) Agent process should be started automatically

2010-07-13 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12888000#action_12888000
 ] 

Jack Krupansky commented on CONNECTORS-60:
--

Unless I am mistaken, the Jetty integration is for QuickStart (single process) 
only. The issue is for non-QuickStart, multi-process execution.


 Agent process should be started automatically
 -

 Key: CONNECTORS-60
 URL: https://issues.apache.org/jira/browse/CONNECTORS-60
 Project: Lucene Connector Framework
  Issue Type: Sub-task
Reporter: Jack Krupansky

 LCF as it exists today is a bit too complex to run for an average user, 
 especially with a separate agent process for crawling. LCF should be as easy 
 to run as Solr is today. QuickStart is a good move in this direction, but the 
 same user-visible simplicity is needed for full LCF. The separate agent 
 process is a reasonable design for execution, but a little too cumbersome for 
 the average user to manage.
 Unfortunately, it is expected that starting up a multi-process application 
 will require platform-specific scripting.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




Re: JSON license

2010-07-13 Thread Jack Krupansky
I'm no expert on licenses, so I would look at what Solr does and maybe ping 
Grant. Presumably LCF should have a legal subdirectory somewhere for 
license notices.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Tuesday, July 13, 2010 5:02 PM
To: jack.krupan...@lucidimagination.com
Cc: connectors-dev@incubator.apache.org
Subject: RE: JSON license

Can you clarify what is meant by "add this license to /legal"?  And what 
the update to NOTICES.TXT should look like?  Something like this?


Apache Lucene Connector Framework
Copyright 2010 The Apache Software Foundation

This product includes software developed by
The Apache Software Foundation (http://www.apache.org/).

Portions of the software are licensed as follows:

Copyright (c) 2002 JSON.org
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

The Software shall be used for Good, not Evil.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.



Karl





Re: [jira] Commented: (CONNECTORS-60) Agent process should be started automatically

2010-07-13 Thread Jack Krupansky
That would help. Keep in mind the Jira issue for bundling the database 
server as well.


I was assuming that there was still some technical advantage to running LCF 
in the non-QuickStart multi-process configuration.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Tuesday, July 13, 2010 5:09 PM
To: connectors-dev@incubator.apache.org
Subject: RE: [jira] Commented: (CONNECTORS-60) Agent process should be 
started automatically


So all you want to see is a postgresql version of QuickStart?  That's 
actually trivial - it's a one-line modification to the properties.xml 
file.  My suggestion is to simply address this with documentation.







[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12886720#action_12886720
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

Karl notes that "we've had to mess with the stuffer query on pretty near every 
point release of Postgresql." Letting/forcing the user to pick the 
right/acceptable release of PostgreSQL to install is error-prone and a support 
headache. I would argue that it is better for the LCF team to bundle the 
right/best release of PostgreSQL with LCF.

 Bundle database server with LCF packaged product
 

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky

 The current requirement that the user install and deploy a PostgreSQL server 
 complicates the installation and deployment of LCF for the user. Installation 
 and deployment of LCF should be as simple as Solr itself. QuickStart is great 
 for the low-end and basic evaluation, but a comparable level of simplified 
 installation and deployment is still needed for full-blown, high-end 
 environments that need the full performance of a PostgreSQL-class database 
 server. So, PostgreSQL should be bundled with the packaged release of LCF so 
 that installation and deployment of LCF will automatically install and deploy 
 a subset of the full PostgreSQL distribution that is sufficient for the needs 
 of LCF. Starting LCF, with or without the LCF UI, should automatically start 
 the database server. Shutting down LCF should also shut down the database 
 server process.
 A typical use case would be for a non-developer who is comfortable with Solr 
 and simply wants to crawl documents from, for example, a SharePoint 
 repository and feed them into Solr. QuickStart should work well for the low 
 end or in the early stages of evaluation, but the user would prefer to 
 evaluate the real thing with something resembling a production crawl of 
 thousands of documents. Such a user might not be a hard-core developer or be 
 comfortable fiddling with a lot of software components simply to do one 
 conceptually simple operation.
 It should still be possible for the user to supply database server settings 
 to override the defaults, but the LCF package should have all of the 
 best-practice settings deemed appropriate for use with LCF.
 One downside is that installation and deployment will be platform-specific 
 since there are multiple processes and PostgreSQL itself requires a 
 platform-specific installation.
 This proposal presumes that PostgreSQL is the best option for the foreseeable 
 future, but nothing here is intended to preclude support for other database 
 servers in future releases.
 This proposal should not have any impact on QuickStart packaging or 
 deployment.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12886724#action_12886724
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

When Karl says "It *does* limit your ability to use other commands 
simultaneously" (referring to use of embedded Derby), he is referring to 
commands executed using the executecommand shell script, such as registering 
and unregistering connectors, which is something typically done once before 
starting the UI or once in a blue moon when you want to support a new type of 
repository, but not done on as regular a basis as editing connections and jobs 
and running jobs. The Java classes to execute those commands would be, by 
definition, outside of the LCF process.




[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12886490#action_12886490
 ] 

Jack Krupansky commented on CONNECTORS-55:
--

I was using the term "install" loosely, not so much the way a typical package 
has a GUI wizard and lots of stuff going on, but more in the sense of raw Solr 
where you download, unzip, and files are in subdirectories right where they 
need to be. In that sense, the theory is that a subset of PostgreSQL could be 
in a subdirectory.

Some enterprising vendor, such as Lucid Imagination, might want to have a fancy 
GUI install, but that would be beyond the scope of what I intended here.





[jira] Created: (CONNECTORS-56) All features should be accessible through an API

2010-07-08 Thread Jack Krupansky (JIRA)
All features should be accessible through an API


 Key: CONNECTORS-56
 URL: https://issues.apache.org/jira/browse/CONNECTORS-56
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky


LCF consists of a full-featured crawling engine and a full-featured user 
interface to access the features of that engine, but some applications are 
better served with a full API that lets the application control the crawling 
engine, including creation and editing of connections and creation, editing, 
and control of jobs. Put simply, everything that a user can accomplish via the 
LCF UI should be doable through an LCF API. All LCF objects should be queryable 
through the API.

A primary use case is Solr applications which currently use Aperture for 
crawling, but would prefer the full-featured capabilities of LCF as a crawling 
engine over Aperture.

I do not wish to over-specify the API in this initial description, but I think 
the LCF API should probably be a traditional REST API, with some of the API 
elements specified via the context path, some parameters via URL query 
parameters, and complex, detailed structures as JSON (or similar). The precise 
details of the API are beyond the scope of this initial description and will be 
added incrementally once the high-level approach to the API becomes reasonably 
settled.
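
To make the three-part request shape described above more concrete, here is a
minimal Java sketch that splits a request URI into its context-path elements
and its URL query parameters. It is an illustration only: the class
ApiRequestShape, the example path /jobs/nightly-crawl, and the parameter
names detail and limit are all hypothetical and are not part of any agreed
LCF API. Complex structures would travel separately as a JSON request body,
which this sketch does not parse.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not LCF code: separates the parts of a request
// that the description assigns to the context path and to query parameters.
final class ApiRequestShape {
    final List<String> pathElements = new ArrayList<String>();
    final Map<String, String> queryParams = new LinkedHashMap<String, String>();

    static ApiRequestShape parse(String uriText) {
        URI uri = URI.create(uriText);
        ApiRequestShape shape = new ApiRequestShape();

        // Context path: "/jobs/nightly-crawl" -> ["jobs", "nightly-crawl"].
        String path = uri.getPath();
        if (path != null) {
            for (String part : path.split("/")) {
                if (!part.isEmpty()) {
                    shape.pathElements.add(part);
                }
            }
        }

        // Query string: "detail=full&limit=10" -> {detail=full, limit=10}.
        String query = uri.getQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    shape.queryParams.put(pair, "");
                } else {
                    shape.queryParams.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return shape;
    }

    public static void main(String[] args) {
        ApiRequestShape shape = parse("/jobs/nightly-crawl?detail=full&limit=10");
        System.out.println(shape.pathElements);
        System.out.println(shape.queryParams);
    }
}
```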

A job status and event reporting scheme is also needed in conjunction with the 
LCF API. That requirement has already been captured as CONNECTORS-41.

The intention for the API is to create, edit, access, and control all of the 
objects managed by LCF. The main focus is on repositories, jobs, and status, 
and less about document-specific crawling information, but there may be some 
benefit to querying crawling status for individual documents as well.

Nothing in this proposal should in any way limit or constrain the features that 
will be available in the LCF UI. The intent is that LCF should continue to have 
a full-featured UI in addition to a full-featured API.

Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.





[jira] Created: (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API

2010-06-30 Thread Jack Krupansky (JIRA)
Proposal for initial two releases of LCF, including packaged product and full 
API
-

 Key: CONNECTORS-50
 URL: https://issues.apache.org/jira/browse/CONNECTORS-50
 Project: Lucene Connector Framework
  Issue Type: New Feature
  Components: Framework core
Reporter: Jack Krupansky


Currently, LCF has a relatively high bar for evaluation and use, requiring 
developer expertise. Also, although LCF has a comprehensive UI, it is not 
currently packaged for use as a crawling engine for advanced applications.

A small set of individual feature requests are needed to address these issues. 
They are summarized briefly to show how they fit together for two initial 
releases of LCF, but will be broken out into individual LCF Jira issues.

Goals:

1. LCF as a standalone, downloadable, usable-out-of-the-box product (much as 
Solr is today)
2. LCF as a toolkit for developers needing customized crawling and repository 
access
3. An API-based crawling engine that can be integrated with applications (as 
Aperture is today)

Larger goals:

1. Make it very easy for users to evaluate LCF.
2. Make it very easy for developers to customize LCF.
3. Make it very easy for applications to fully manage and control LCF in 
operation.

Two phases:

1) Standalone, packaged app that is super-easy to evaluate and deploy. Call it 
LCF 0.5.
2) API-based crawling engine for applications for which the UI might not be 
appropriate. Call it LCF 1.0.


Phase 1
---

LCF 0.5 right out of the box would interface loosely with Solr 1.4 or later.
It would contain roughly the features that are currently in place or currently 
underway, plus a little more.

Specifically, LCF 0.5 would contain these additional capabilities:

1. Plug-in architecture for connectors (already underway)
2. Packaged app ready to run with embedded Jetty app server (I think this has 
been agreed to)
3. Bundled with database - PostgreSQL or Derby - ready to run without 
additional manual setup
4. Mini-API to initially configure default connections and example jobs for 
file system and web crawl
5. Agent process started automatically (platform-specific startup required)
6. Solr output connector option to commit at end of job, by default

Installation and basic evaluation of LCF would be essentially as simple as Solr 
is today. The example connections and jobs would permit the user to initiate 
example crawls of a file system example directory and an example web on the LCF 
web site with just a couple of clicks (as opposed to the detailed manual setup 
required today to create repository and output connections and jobs).

It is worth considering whether the SharePoint connector could also be included 
as part of the default package.

Users could then add additional connectors and repositories and jobs as desired.

Timeframe for release? Level of effort?

Phase 2
---

The essence of Phase 2 is that LCF would be split to allow direct, full API 
access to LCF as a crawling engine, in addition to the full LCF UI. Call this 
LCF 1.0.

Specifically, LCF 1.0 would contain these additional capabilities:

1. Full API for LCF as a crawling engine
2. LCF can be bundled within an app (such as the default LCF package itself 
with its UI)
3. LCF event and activity notification for full control by an application 
(already a Jira request)

Overall, LCF will offer roughly the same crawling capabilities as with LCF 0.5, 
plus whatever bug
fixes and minor enhancements might also be added.

Timeframe for release? Level of effort?

-

Issues:

- Can we package PostgreSQL with LCF so LCF can set it up?
  - Or do we need Derby for that purpose?
- Managing multiple processes (UI, database, agent, app processes)
- What exactly would the API look like? (URL, XML, JSON, YAML?)





Re: Derby/JUnit bad interaction - any ideas?

2010-06-08 Thread Jack Krupansky
If we need to require Java 1.6, that is probably okay. I am fine with that. 
Does anybody have a serious objection to requiring Java 1.6 for LCF?


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Tuesday, June 08, 2010 6:35 AM
To: connectors-dev@incubator.apache.org
Subject: Derby/JUnit bad interaction - any ideas?

I've been trying to get some basic tests working under JUnit. 
Unfortunately, I've run into a Derby problem which prevents these tests 
from working.


What happens is this.  Derby, when it creates a database, forces a number 
of directories within the database to read-only.  Unfortunately, unless 
we stipulate Java 1.6 or up, there is no native Java way to make these 
directories become non-read-only.  So database cleanup always fails to 
actually remove the old database, and then new database creation 
subsequently fails.
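
For reference, the Java 1.6 route mentioned above relies on File.setWritable,
which was added in that release (earlier versions offer only setReadOnly). A
minimal sketch of a cleanup helper follows. The class name WritableDelete and
the method deleteTree are hypothetical, this is not the LCF test code, and it
assumes the only obstacle to deletion is the read-only flag.

```java
import java.io.File;

// Hypothetical test-cleanup helper, not LCF code.
final class WritableDelete {
    // Clears the read-only flag on f, deletes its contents recursively,
    // then deletes f itself. Returns true if f no longer exists afterwards.
    static boolean deleteTree(File f) {
        // Requires Java 1.6 or later. A directory must be writable before
        // its children can be removed on most platforms.
        f.setWritable(true);
        if (f.isDirectory()) {
            File[] children = f.listFiles();
            if (children != null) {
                for (File child : children) {
                    deleteTree(child);
                }
            }
        }
        f.delete();
        return !f.exists();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: WritableDelete <directory>");
            return;
        }
        System.out.println(deleteTree(new File(args[0])) ? "deleted" : "not deleted");
    }
}
```

The helper does not special-case symbolic links, so it should only be pointed
at a scratch database directory created by the tests themselves.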


So there are two possibilities.  First, we can change things so we never 
actually try to clean up the Derby DB.  Second, we can mandate that Java 
1.6 is used for LCF.  That's all there really is.


The first possibility is tricky but doable - I think.  The second would 
probably be unacceptable in many ways.


Thoughts?

Karl






Re: Derby

2010-06-03 Thread Jack Krupansky

Just to be clear, the full sequence would be:

1) Start UI app. Agent process should not be running.
2) Start LCF job in UI.
3) Shutdown UI app. Not just close the browser window.
4) AgentRun.
5) Wait long enough for crawl to have finished. Maybe watch to see that Solr 
has become idle.
6) Possibly commit to Solr.
7) AgentStop.
8) Back to step 1 for additional jobs.

Correct?

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Thursday, June 03, 2010 7:24 PM
To: connectors-dev@incubator.apache.org
Subject: RE: Derby

The daemon does not need to interact with the UI directly, only with the 
database.  So, you stop the UI, start the daemon, and after a while, shut 
down the daemon and restart the UI.


Karl

-Original Message-
From: ext Jack Krupansky [mailto:jack.krupan...@lucidimagination.com]
Sent: Thursday, June 03, 2010 5:51 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Derby

(1) You can't run more than one LCF process at a time.  That means you
need to either run the daemon or the crawler-ui web application, but you
can't run both at the same time.


How do you Start a crawl then, if not in the web app, which then starts the
agent process crawling?

Thanks for all of this effort!

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Thursday, June 03, 2010 5:34 PM
To: connectors-dev@incubator.apache.org
Subject: Derby


For what it's worth, after some 5 days of work, and a couple of schema
changes to boot, LCF now runs with Derby.
Some caveats:

(1) You can't run more than one LCF process at a time.  That means you
need to either run the daemon or the crawler-ui web application, but you
can't run both at the same time.
(2) I haven't tested every query, so I'm sure there are probably some
that are still broken.
(3) It's slow.  Count yourself as fortunate if it runs 1/5 the rate of
Postgresql for you.
(4) Transactional integrity hasn't been evaluated.
(5) Deadlock detection and unique constraint violation detection are
probably not right, because I'd need to cause these errors to occur before
being able to key off their exception messages.
(6) I had to turn off the ability to sort on certain columns in the
reports - basically, any column that was represented as a large character
field.

Nevertheless, this represents an important milestone on the path to being
able to write some kind of unit tests that have at least some meaning.

If you have an existing LCF Postgresql database, you will need to force an
upgrade after going to the new trunk code.  To do this, repeat the
org.apache.lcf.agents.Install command, and the
org.apache.lcf.agents.Register
org.apache.lcf.crawler.system.CrawlerAgent command after deploying the
new code.  And, please, let me know of any kind of errors you notice that
could be related to the schema change.

Thanks,
Karl





Re: Derby

2010-06-03 Thread Jack Krupansky
What is the nature of the single LCF process issue? Is it because the 
database is being used in single-user mode, or some other issue? Is it a 
permanent issue, or is there a solution or workaround anticipated at some 
stage?


Thanks.

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Thursday, June 03, 2010 5:34 PM
To: connectors-dev@incubator.apache.org
Subject: Derby

For what it's worth, after some 5 days of work, and a couple of schema 
changes to boot, LCF now runs with Derby.

Some caveats:

(1) You can't run more than one LCF process at a time.  That means you 
need to either run the daemon or the crawler-ui web application, but you 
can't run both at the same time.
(2) I haven't tested every query, so I'm sure there are probably some 
that are still broken.
(3) It's slow.  Count yourself as fortunate if it runs 1/5 the rate of 
Postgresql for you.
(4) Transactional integrity hasn't been evaluated.
(5) Deadlock detection and unique constraint violation detection are 
probably not right, because I'd need to cause these errors to occur before 
being able to key off their exception messages.
(6) I had to turn off the ability to sort on certain columns in the 
reports - basically, any column that was represented as a large character 
field.


Nevertheless, this represents an important milestone on the path to being 
able to write some kind of unit tests that have at least some meaning.


If you have an existing LCF Postgresql database, you will need to force an 
upgrade after going to the new trunk code.  To do this, repeat the 
org.apache.lcf.agents.Install command, and the 
org.apache.lcf.agents.Register 
org.apache.lcf.crawler.system.CrawlerAgent command after deploying the 
new code.  And, please, let me know of any kind of errors you notice that 
could be related to the schema change.


Thanks,
Karl





Re: Some more thoughts on a classloader plug-in style architecture

2010-06-02 Thread Jack Krupansky

Good point. LCF is a bit more complex than Solr in that sense.

Maybe a separate class is needed that has methods to retrieve the crawl and 
UI components of a connector.


Or a small XML file with whatever info about the connector is needed. Or 
maybe it is simple enough for a properties file.


Or maybe just a naming convention so that the name of the UI component can 
be deduced given the logical name of a connector.
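
As a rough illustration of the properties-file and naming-convention ideas above, here is a minimal sketch. Everything in it is an assumption made up for the example: the ConnectorDescriptor class, the connector.name / connector.class / connector.ui keys, and the "connectors/<name>/ui" convention. None of these exist in LCF today.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import java.util.Properties;

/** Hypothetical descriptor for one connector; not an existing LCF class. */
class ConnectorDescriptor {
    final String name;        // logical connector name
    final String crawlClass;  // Java class implementing the crawl side
    final String uiComponent; // where the UI component for this connector lives

    ConnectorDescriptor(String name, String crawlClass, String uiComponent) {
        this.name = name;
        this.crawlClass = crawlClass;
        this.uiComponent = uiComponent;
    }

    /** Parses a descriptor from properties-file text; the key names are illustrative only. */
    static ConnectorDescriptor parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("could not read descriptor", e);
        }
        String name = props.getProperty("connector.name");
        String crawlClass = props.getProperty("connector.class");
        if (name == null || crawlClass == null) {
            throw new IllegalArgumentException("connector.name and connector.class are required");
        }
        // Naming convention: if no UI component is listed, deduce it from the logical name.
        String conventional = "connectors/" + name.toLowerCase(Locale.ROOT) + "/ui";
        String ui = props.getProperty("connector.ui", conventional);
        return new ConnectorDescriptor(name, crawlClass, ui);
    }

    public static void main(String[] args) {
        ConnectorDescriptor d = parse(
            "connector.name=FileSystem\nconnector.class=org.example.FileConnector\n");
        System.out.println(d.name + " -> " + d.crawlClass + ", UI at " + d.uiComponent);
    }
}
```

The explicit connector.ui key and the convention-based fallback cover two of the three options in one place; a separate class with accessor methods would expose the same two pieces of information.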


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Wednesday, June 02, 2010 6:45 AM
To: connectors-dev@incubator.apache.org
Subject: Some more thoughts on a classloader plug-in style architecture

It occurred to me that a classloader plug-in reader for LCF would not 
achieve the goal of allowing a fully prebuilt LCF with connector add-ons. 
The reason, which should have been obvious from the beginning, is because 
each connector consists not only of the Java implementation, but also a UI 
component.  The UI component will need a mechanism similar to the 
classloader one in order for everything to work.


It is possible, I suppose, for precompiled JSP's to be class-loaded 
instead of uncompiled JSP's in the lcf-crawler-ui.war file.  However this 
is going to require some care and finesse (and example build.xml files) to 
get it to work properly.


Karl





Re: Some more thoughts on a classloader plug-in style architecture

2010-06-02 Thread Jack Krupansky
Sounds good to me, assuming that LCF remains relatively stable. I am 
presuming that a fair number of people can and will be using LCF for various 
purposes well before the actual formal first release anyway. The point being 
that delaying the formal release shouldn't slow most people from using LCF 
as is, provided that any phases of instability are relatively contained, 
which seems to be the normal case here anyway.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Wednesday, June 02, 2010 9:00 AM
To: connectors-dev@incubator.apache.org
Subject: RE: Some more thoughts on a classloader plug-in style architecture

I've entered a ticket CONNECTORS-40 for this work.  What I propose is that 
this gets done before the first official LCF release, because of the potential 
backwards-compatibility issues involved.  It is, however, quite a heavy 
lift - I can't imagine getting it done in less than a couple of weeks of 
straight-out effort.  To minimize the elapsed development time of this 
step, I propose that we not attempt to convert any JSP's from the 
framework itself in this round, but merely those for each connector.  I'll 
start by adding appropriate placeholder methods to the interfaces and all 
connectors, and then try to port the UI components one at a time, for one 
specific connector.  Then, we can have a back-and-forth about how to 
refine that one implementation before I attempt the rest.


Sound reasonable?

Karl


-Original Message-
From: ext Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Wednesday, June 02, 2010 8:51 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Some more thoughts on a classloader plug-in style 
architecture


Yeah, definitely embedded Java (ala JSP's <% %> stuff) isn't how
Velocity works.  It can call Java code just fine via tools (Java
objects) that are injected into the Velocity context.  Any
sophisticated business logic can be distilled from the existing plugin
JSPs and migrated to Java classes and added to the Velocity context.

If specific connectors need more logic, they could ship with their own
tools, and we build in a mechanism to pull in custom tools into the
context.

Erik

On Jun 2, 2010, at 8:28 AM, karl.wri...@nokia.com karl.wri...@nokia.com
 wrote:


I've just spelunked through what I could find online, and it seems
at least plausible to use Velocity for various LCF HTML templating
needs.  The major concern that I have is that the mix of inline java
to HTML in the LCF stuff is weighted heavily towards inline java -
which doesn't seem to be where Velocity's sweet spot is. ;-)  The
general underlying idea of invoking a connector-API-based method
instead of providing a JSP seems sound, however, no matter what the
templating engine is that does the final assembly.

Karl


-Original Message-
From: ext Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Wednesday, June 02, 2010 8:19 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Some more thoughts on a classloader plug-in style
architecture

Velocity is just a simple templating engine, so it could be used in
the intermediate fashion to produce only snippets that mesh into the
rest of the built-in UI, no problem.

Erik



On Jun 2, 2010, at 8:04 AM, karl.wri...@nokia.com karl.wri...@nokia.com
wrote:

Does the entire UI have to be converted to Velocity for this
approach to work?  There's an intermediate path that would involve
converting only the connector portions, which might be viable.

Karl


-Original Message-
From: ext Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: Wednesday, June 02, 2010 7:27 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Some more thoughts on a classloader plug-in style
architecture

Actually the problem is quite tractable.  We switch the UI over to
Velocity templates (like Solr's VelocityResponseWriter, for example)
and embed the UI bits into plugin JAR files that can live externally.

JSPs aren't really workable in this fashion.

Velocity templates can be loaded from the file system, the classpath,
from a String, or from thin air.  For example, VelocityResponseWriter
allows templates to load from the actual request (clients can send in
a template as an HTTP parameter), if not found it looks on the
filesystem, and if not found it looks in the classpath.
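
The lookup order described above (request first, then filesystem, then classpath) can be sketched in plain Java. This is only an illustration of the fallback chain and does not use Velocity or reproduce VelocityResponseWriter's code; the TemplateLookup class, the "template." parameter prefix, and the ".vm" file suffix are assumptions made for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.Collections;
import java.util.Map;
import java.util.Scanner;

/** Hypothetical illustration of a request -> filesystem -> classpath template lookup. */
class TemplateLookup {

    /** Returns the template text, or null if none of the three sources has it. */
    static String find(String name, Map<String, String> requestParams, File templateDir) {
        // 1) A template sent along with the request itself (parameter name is made up).
        String inline = requestParams.get("template." + name);
        if (inline != null) {
            return inline;
        }
        // 2) A template file on the filesystem.
        File onDisk = new File(templateDir, name + ".vm");
        if (onDisk.isFile()) {
            try {
                return readAll(new FileInputStream(onDisk));
            } catch (FileNotFoundException e) {
                // The file vanished between the check and the open; fall through to the classpath.
            }
        }
        // 3) A template bundled on the classpath, e.g. inside a plugin JAR.
        InputStream in = TemplateLookup.class.getClassLoader().getResourceAsStream(name + ".vm");
        return in == null ? null : readAll(in);
    }

    /** Reads the whole stream as UTF-8 text and closes it. */
    private static String readAll(InputStream in) {
        Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A");
        try {
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            scanner.close();
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("template.browse", "Hello $name");
        System.out.println(find("browse", params, new File("templates")));
    }
}
```

The third step is the one that matters for connector plugins: a template packaged inside a connector's JAR becomes visible without touching the web application.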

Erik

On Jun 2, 2010, at 6:55 AM, Jack Krupansky wrote:


Good point. LCF is a bit more complex than Solr in that sense.

Maybe a separate class is needed that has methods to retrieve the
crawl and UI components of a connector.

Or a small XML file with whatever info about the connector is
needed. Or maybe it is simple enough for a properties file.

Or maybe just a naming convention so that the name of the UI
component can be deduced given the logical name of a connector.

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Wednesday, June 02, 2010 6:45 AM
To: connectors-dev

[jira] Commented: (CONNECTORS-37) LCF should use an XML configuration file, not the simple name/value config file it currently has

2010-06-01 Thread Jack Krupansky (JIRA)

[ https://issues.apache.org/jira/browse/CONNECTORS-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874029#action_12874029 ]

Jack Krupansky commented on CONNECTORS-37:
--

I'll defer to the community on the logging issue, other than to simply say that 
it should be as standard as possible and relatively compatible with how Solr 
does logging so that it will not surprise people.

I don't have a problem with the LCF .properties file per se, other than the 
fact that since it is restricted to being strictly keyword/value pairs it 
cannot contain more complex, structured configuration information.

The main thing I'd like to see is that the current executecommand 
configuration setup, such as which output connectors and crawlers to register, 
be done using descriptions in a config file rather than discrete shell commands 
to manually execute. The default config file from svn checkout should have a 
default set of connectors, crawlers, etc., and have commented-out entries for 
other connectors that people can un-comment and edit as desired.

A key advantage of having such a config file is that when people do report 
problems here we can ask them to provide their config file rather than ask them 
to try to remember and re-type whatever commands they might remember that they 
intended to type.

Whether connections and jobs can be initially created from a config file is a 
larger discussion. The main point here is simply that it be easy to get LCF 
initialized and configured for the really basic stuff needed for a typical 
initial evaluation (comparable to what occurs in a Solr tutorial.) The 
proverbial zero-hour experience.


 LCF should use an XML configuration file, not the simple name/value config 
 file it currently has
 

 Key: CONNECTORS-37
 URL: https://issues.apache.org/jira/browse/CONNECTORS-37
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Karl Wright

 LCF's configuration file is limited in what it can specify, and XML 
 configuration files seem to offer more flexibility and are the modern norm.  
 Before backwards compatibility becomes an issue, it may therefore be worth 
 converting the property file reader to use XML rather than name/value format. 
  It would also be nice to be able to fold the logging configuration into the 
 same file, if this seems possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Proposal for simple LCF deployment model

2010-05-28 Thread Jack Krupansky
(b) The alternative starting point should probably autocreate the database,
and should also autoregister all connectors.  This will require a list, somewhere,
of the connectors and authorities that are included, and their preferred UI
names for that installation.  This could come from the configuration
information, or from some other place.  Any ideas?


I would like to see two things: 1) A way to request LCF to dump all 
configuration parameters, including parameters for all output connections, 
repositories,  jobs, et al to an LCF config file, and 2) The ability to 
start from scratch with a fresh deployment of LCF and feed it that config 
file to then create all of the output connections, repository connections, 
and jobs to match the LCF configuration state desired.


Now, whether that config file is simple XML ala solrconfig.xml can be a 
matter for debate. Whether it is a separate file from the current config 
file can also be a matter for debate.


But, in short, the answer to your question would be that there would be an 
LCF config file (not just the simple keyword/value file that LCF has for 
global configuration settings) to seed the initial output connections, 
repository connections, et al.


Maybe this config file is a little closer to the Solr schema file. I think 
it feels that way. OTOH, the list of registered connectors, as opposed to 
the user-created connections that use those connectors, seems more like Solr 
request handlers that are in solrconfig.xml, so maybe the initial 
configuration would be split into two separate files as in Solr. Or, 
maybe, the Solr guys have a better proposal for how they would have managed 
that split in Solr if they had it to do all over again. My preference would 
be one file for the whole configuration.


Another advantage of such a config file is that it is easier for people to 
post problem reports that show exactly how they set up LCF.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 5:48 AM
To: connectors-dev@incubator.apache.org
Subject: Proposal for simple LCF deployment model

The current LCF standard deployment model requires a number of moving 
parts, which are probably necessary in some cases, but simply introduce 
complexity in others.  It has occurred to me that it may be possible to 
provide an alternate deployment model involving Jetty, which would reduce 
the number of moving parts by one (by eliminating Tomcat).  A simple LCF 
deployment could then, in principle, look pretty much like Solr's.


In order for this to work, the following has to be true:

(1) jetty's basic JSP support must be comparable to Tomcat's.
(2) the class loader that jetty uses for webapp's must provide class 
isolation similar to Tomcat's.  If this condition is not met, we'd need to 
build both a Tomcat and a Jetty version of each webapp.


The overall set of changes that would be required would be the following:
(a) An alternative start entry point would need to be coded, which would 
start Jetty running the lcf-crawler-ui and lcf-authority-service webapps 
before bringing up the agents engine.
(b) The alternative starting point should probably autocreate the 
database, and should also autoregister all connectors.  This will require 
a list, somewhere, of the connectors and authorities that are included, 
and their preferred UI names for that installation.  This could come from 
the configuration information, or from some other place.  Any ideas?
(c) There would need to be an additional jar produced by the build process, 
which would be the equivalent of the solr start.jar, so as to make running 
the whole stack trivial.
(d) An LCF API web application, which provides access to all of the 
current LCF commands, would also be an obvious requirement to go forward 
with this model.


What are the disadvantages?  Well, I think that the main problem would be 
security.  This deployment model, though simple, does not control access 
to LCF in any way.  You'd need to introduce another moving part to do 
that.


Bear in mind that this change would still not allow LCF to run using only 
one process.  There are still separate RMI-based processes needed for some 
connectors (Documentum and FileNet).  Although these could in theory be 
started up using Java Activation, a main reason for a separate process in 
Documentum's case is that DFC randomly crashes the JVM under which it 
runs, and thus needs to be independently restarted if and when it dies. 
If anyone has experience with Java Activation and wants to contribute 
their time to develop infrastructure that can deal with that problem, 
please let me know.


Finally, there is no way around the fact that LCF requires a 
well-performing database, which constitutes an independent moving part of 
its own.  This proposal does nothing to change that at all.


Please note that I'm not proposing that the current model go away, but 
rather that we support

Re: Proposal for simple LCF deployment model

2010-05-28 Thread Jack Krupansky
But for a basic, early evaluation, test drive, just the file system and 
web repository connectors should be sufficient. And if there is a clean 
database abstraction, a basic database package (e.g., derby) should be 
sufficient for such a basic evaluation.


Are there technical reasons why third-party repository connectors cannot be 
supported using a Solr-style plug-in approach? Or, worst case, as separate 
processes with a clean inter-process API? Maybe not in the near-term, but as 
a longer-term vision.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 11:10 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Proposal for simple LCF deployment model

You forget that building LCF in its entirety requires that you supply 
proprietary client components from third-party vendors.  So I think it is 
unrealistic to expect canned builds that contain everything that you just 
deploy.  For LCF I think the build cycle will thus be very common.


Getting rid of the database requirement is also obviously not an option.

Karl

--- original message ---
From: ext Jack Krupansky jack.krupan...@lucidimagination.com
Subject: Re: Proposal for simple LCF deployment model
Date: May 28, 2010
Time: 10:42:17  AM


A simple deployment ala Solr is a good goal. Integrating Jetty with the LCF
deployment will go a long way towards that goal. The database software
deployment (PostgreSQL) is the other half of the hassle with deploying LCF.


I think there are three distinct goals here: 1) A super-easy Solr-style
deployment for initial evaluation of LCF, 2) deployment of the LCF
components for full-blown application development where app server and
database might need to be different from the initial evaluation, and 3)
deployment of LCF components for production deployment of the full
application.

Right now, evaluation of LCF requires deployment of the source code and
building artifacts - Solr evaluation does not require that step. Eliminating
the source and build step will certainly help simplify the evaluation
process.

Another possible consideration is that although some of us are especially
interested in integration with Solr and doing so easily and robustly, Solr
is just one of the output connections and LCF could be deployed for
applications that do not involve Solr at all. So, maybe there should be an
extra deployment wiki page for Solr guys that focuses on use of LCF with
Solr and related issues. Whether that should be the default presentation in
the doc is a matter for debate. Right now, I see no harm with a Solr bias.
At least it is a convenient way to demonstrate end-to-end use of LCF.

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 5:48 AM
To: connectors-dev@incubator.apache.org
Subject: Proposal for simple LCF deployment model


The current LCF standard deployment model requires a number of moving
parts, which are probably necessary in some cases, but simply introduce
complexity in others.  It has occurred to me that it may be possible to
provide an alternate deployment model involving Jetty, which would reduce
the number of moving parts by one (by eliminating Tomcat).  A simple LCF
deployment could then, in principle, look pretty much like Solr's.

In order for this to work, the following has to be true:

(1) jetty's basic JSP support must be comparable to Tomcat's.
(2) the class loader that jetty uses for webapp's must provide class
isolation similar to Tomcat's.  If this condition is not met, we'd need to
build both a Tomcat and a Jetty version of each webapp.

The overall set of changes that would be required would be the following:
(a) An alternative start entry point would need to be coded, which would
start Jetty running the lcf-crawler-ui and lcf-authority-service webapps
before bringing up the agents engine.
(b) The alternative starting point should probably autocreate the
database, and should also autoregister all connectors.  This will require
a list, somewhere, of the connectors and authorities that are included,
and their preferred UI names for that installation.  This could come from
the configuration information, or from some other place.  Any ideas?
(c) There would need to be an additional jar produced by the build process,
which would be the equivalent of the solr start.jar, so as to make running
the whole stack trivial.
(d) An LCF API web application, which provides access to all of the
current LCF commands, would also be an obvious requirement to go forward
with this model.

What are the disadvantages?  Well, I think that the main problem would be
security.  This deployment model, though simple, does not control access
to LCF in any way.  You'd need to introduce another moving part to do
that.

Bear in mind that this change would still not allow LCF to run using only
one process.  There are still separate RMI-based processes needed

Re: Proposal for simple LCF deployment model

2010-05-28 Thread Jack Krupansky

The use cases I was considering for database issues are:

1) Desire for a very simple evaluation install process. See the Solr 
tutorial.
2) Desire for less complex and faster application deployment install 
process. PostgreSQL has a reputation for having a large footprint.


Now, as machines and software evolve, it is not completely clear to me how 
bad PostgreSQL is these days, but having a separate deployment step to 
accommodate PostgreSQL interferes with use case #1.


That said, I am not sure that I would hold up getting the first official 
release of LCF out the door. After all, leading-edge (bleeding-edge) users 
are used to more than a little inconvenience. Still, a Solr-simple 
evaluation install would be... sweet.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 2:17 PM
To: connectors-dev@incubator.apache.org
Subject: RE: Proposal for simple LCF deployment model

I've been fighting with Derby for two days.  It's missing a significant 
amount of important functionality, and its user and database model are 
radically different from all other databases I know of.  (I'm also getting 
nonsense exceptions from it, but that's another matter.)  So regardless of 
how good the database abstraction layer is, expecting all databases to 
have sufficient functionality to get anything done is ridiculous.  If I 
get Derby working, I will let you know whether it is feasible at all to 
run LCF on it under any circumstances or not, but that *cannot* be the 
primary database people use with this project.  I'm also still waiting for 
a use-case from you as to how getting rid of the Postgresql database makes 
your life easier at all - and if your use case involves using Derby for 
anything serious, I'll have to say that I don't think that's realistic.


LCF has a very clean connector abstraction today.  So all we're really 
talking about is the build process here - whether it is possible to 
separate build and deployment of the framework and some connectors from 
the builds of other connectors.  Having each connector run as a separate 
process seems like overkill and would also impact performance pretty 
dramatically, as well as requiring quite a bit of additional 
configuration.  The Solr plug-in model is a bit better and requires only 
the addition of a custom classloader that explicitly loads any plugin 
classes and any classes that those use.  The required defines that some 
libraries need would have to be solved, but that needs doing anyway and I 
think I can have individual connectors set these as needed.
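
To make the custom-classloader idea concrete, here is a minimal sketch, assuming connector jars are dropped into a single plug-in directory. The PluginLoader class and the directory layout are hypothetical; this is neither LCF's nor Solr's actual loader. It uses the standard URLClassLoader, which delegates to its parent first, so it shows how add-on classes could be found at run time but does not by itself isolate connectors from one another.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plug-in loader sketch; not an existing LCF or Solr class. */
class PluginLoader {

    /** Builds a class loader over every .jar file found directly in pluginDir. */
    static ClassLoader forDirectory(File pluginDir) {
        List<URL> jars = new ArrayList<URL>();
        File[] files = pluginDir.listFiles(); // null if the directory does not exist
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.getName().endsWith(".jar")) {
                    try {
                        jars.add(f.toURI().toURL());
                    } catch (MalformedURLException e) {
                        throw new IllegalStateException("bad plug-in path: " + f, e);
                    }
                }
            }
        }
        // Parent-first delegation: framework classes win, plug-in jars supply the rest.
        return new URLClassLoader(jars.toArray(new URL[0]), PluginLoader.class.getClassLoader());
    }

    /** Loads className through the given loader and calls its no-argument constructor. */
    static Object instantiate(ClassLoader loader, String className) {
        try {
            return Class.forName(className, true, loader).getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: PluginLoader <plugin-dir> <connector-class-name>");
            return;
        }
        Object connector = instantiate(forDirectory(new File(args[0])), args[1]);
        System.out.println("loaded " + connector.getClass().getName());
    }
}
```

With something along these lines, the framework could be built and deployed once, and a connector built elsewhere could be added by copying its jar (plus any vendor jars it needs) into the plug-in directory.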


Karl



-Original Message-
From: ext Jack Krupansky [mailto:jack.krupan...@lucidimagination.com]
Sent: Friday, May 28, 2010 1:49 PM
To: connectors-dev@incubator.apache.org
Subject: Re: Proposal for simple LCF deployment model

But for a basic, early evaluation, test drive, just the file system and
web repository connectors should be sufficient. And if there is a clean
database abstraction, a basic database package (e.g., derby) should be
sufficient for such a basic evaluation.

Are there technical reasons why third-party repository connectors cannot be
supported using a Solr-style plug-in approach? Or, worst case, as separate
processes with a clean inter-process API? Maybe not in the near-term, but as
a longer-term vision.

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 11:10 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Proposal for simple LCF deployment model


You forget that building LCF in its entirety requires that you supply
proprietary client components from third-party vendors.  So I think it is
unrealistic to expect canned builds that contain everything that you just
deploy.  For LCF I think the build cycle will thus be very common.

Getting rid of the database requirement is also obviously not an option.

Karl

--- original message ---
From: ext Jack Krupansky jack.krupan...@lucidimagination.com
Subject: Re: Proposal for simple LCF deployment model
Date: May 28, 2010
Time: 10:42:17  AM


A simple deployment ala Solr is a good goal. Integrating Jetty with the LCF
deployment will go a long way towards that goal. The database software
deployment (PostgreSQL) is the other half of the hassle with deploying LCF.

I think there are three distinct goals here: 1) A super-easy Solr-style
deployment for initial evaluation of LCF, 2) deployment of the LCF
components for full-blown application development where app server and
database might need to be different from the initial evaluation, and 3)
deployment of LCF components for production deployment of the full
application.

Right now, evaluation of LCF requires deployment of the source code and
building artifacts - Solr evaluation does not require that step. Eliminating
the source and build step will certainly help simplify the evaluation
process.

Another possible consideration

Re: Proposal for simple LCF deployment model

2010-05-28 Thread Jack Krupansky
I meant the lcf.agents.RegisterOutput org.apache.lcf.agents.output.* and 
lcf.crawler.Register org.apache.lcf.crawler.connectors.* types of operations 
that are currently executed as standalone commands, as well as the 
connections created using the UI. So, you would have config file entries for 
both the registration of connector classes and the definition of the 
actual connections in some new form of config file. Sure, the connector 
registration initializes the database, but it is all part of the collection 
of operations that somebody has to perform to go from scratch to an LCF 
configuration that is ready to Start a crawl. Better to have one (or two 
or three if necessary) config file that encompasses the entire 
configuration setup rather than separate manual steps.


Whether it is high enough priority for the first release is a matter for 
debate.


-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 11:16 AM
To: connectors-dev@incubator.apache.org
Subject: Re: Proposal for simple LCF deployment model


Dump and restore functionality already exists, but the format is not XML.

Providing an XML dump and restore is straightforward.  Making such a file 
operate like a true config file is not.


This, by the way, has nothing to do with registering connectors, which is 
a database initialization operation.


Karl

--- original message ---
From: ext Jack Krupansky jack.krupan...@lucidimagination.com
Subject: Re: Proposal for simple LCF deployment model
Date: May 28, 2010
Time: 10:33:34  AM



(b) The alternative starting point should probably autocreate the database,
and should also autoregister all connectors.  This will require a list, somewhere,
of the connectors and authorities that are included, and their preferred UI
names for that installation.  This could come from the configuration
information, or from some other place.  Any ideas?


I would like to see two things: 1) A way to request LCF to dump all
configuration parameters, including parameters for all output connections,
repositories,  jobs, et al to an LCF config file, and 2) The ability to
start from scratch with a fresh deployment of LCF and feed it that config
file to then create all of the output connections, repository connections,
and jobs to match the LCF configuration state desired.

Now, whether that config file is simple XML ala solrconfig.xml can be a
matter for debate. Whether it is a separate file from the current config
file can also be a matter for debate.

But, in short, the answer to your question would be that there would be an
LCF config file (not just the simple keyword/value file that LCF has for
global configuration settings) to see the initial output connections,
repository connections, et al.

Maybe this config file is a little closer to the Solr schema file. I think
it feels that way. OTOH, the list of registered connectors, as opposed to
the user-created connections that use those connectors, seems more like Solr
request handlers that are in solrconfig.xml, so maybe the initial
configuration would be split into two separate files as in Solr. Or,
maybe, the Solr guys have a better proposal for how they would have managed
that split in Solr if they had it to do all over again. My preference would
be one file for the whole configuration.

Another advantage of such a config file is that it is easier for people to
post problem reports that show exactly how they set up LCF.

-- Jack Krupansky

--
From: karl.wri...@nokia.com
Sent: Friday, May 28, 2010 5:48 AM
To: connectors-dev@incubator.apache.org
Subject: Proposal for simple LCF deployment model


The current LCF standard deployment model requires a number of moving
parts, which are probably necessary in some cases, but simply introduce
complexity in others.  It has occurred to me that it may be possible to
provide an alternate deployment model involving Jetty, which would reduce
the number of moving parts by one (by eliminating Tomcat).  A simple LCF
deployment could then, in principle, look pretty much like Solr's.

In order for this to work, the following has to be true:

(1) jetty's basic JSP support must be comparable to Tomcat's.
(2) the class loader that jetty uses for webapp's must provide class
isolation similar to Tomcat's.  If this condition is not met, we'd need to
build both a Tomcat and a Jetty version of each webapp.

The overall set of changes that would be required would be the following:
(a) An alternative start entry point would need to be coded, which would
start Jetty running the lcf-crawler-ui and lcf-authority-service webapps
before bringing up the agents engine.
(b) The alternative starting point should probably autocreate the
database, and should also autoregister all connectors.  This will require
a list, somewhere, of the connectors and authorities that are included,
and their preferred UI names