Looking into the payload patch is next on my list after I finish up
the benchmarking, but I know the payload patch is going to require
several sets of eyes and some discussion. I would like to see the
more flexible index format addressed and I am not sure about the
removal of Fieldable as an interface, but I admit I haven't looked
into the depths just yet. The reason I like Fieldable as an
interface is I think it keeps us open to having Documents be backed
by databases, etc. (I have been toying with the notion of a DBFieldable,
a Fieldable implementation backed by a DB for seamless use of a DB
when creating Documents, but it isn't ready yet.)
As for management, I think we could benefit from getting some
people to work toward a set of goals for improvements. I can think
of at least 3 major things (and I am sure others could add theirs)
that would have significant impact:
1. Flexible Indexing
2. Java 1.5
3. Merge improvements suggested by Dave Balmain and Marvin Humphrey
concerning segments and also storing vector information outside of
the merge segments.
We have some planning docs on the Wiki, but I don't think anyone has
really taken up that torch.
-Grant
On Oct 17, 2006, at 9:12 AM, Erik Hatcher wrote:
Steven,
Thanks for this prod. I've been meaning to debrief the Lucene dev
group on a recent visit I made to IBM's Silicon Valley Labs where I
presented Lucene and met with their search gurus. A recurring
theme from them was the management of the Java Lucene project, and
what they and we can do to get patches applied more effectively and
move Lucene forward.
For example, the payload design proposed by one of the SVL team
members is something they really need to accomplish clever XML
indexing, and we've not done anything about it even though the
proposal was backwards compatible and added something valuable to
Lucene. Granted, that did open the discussion up into a more
generalized "flexible index format" topic, but that has yet to
become real code.
I am all for us adopting the JIRA techniques that Hadoop uses, and
for us committers (and contributors) to become more efficient and
effective with patch and release management.
Now all we need are volunteers to facilitate this. I'd love to
help more, but I'm much too frazzled and swamped to manage
something like this myself currently. But I strongly support it!
Erik
On Oct 17, 2006, at 8:19 AM, Steven Parkes wrote:
As a member of a number of Lucene subprojects dev lists, I've been
comparing the way the Jira workflow is used on the different
projects.
In particular, I've been noting the difference between the workflows
that Lucene Java and Hadoop use. Hadoop in particular has a state for
"patch submitted" which, as it's used on Hadoop, seems to facilitate
communication. State changes (open->patch submitted,
patch-submitted->open) seem to help communications between
contributors and reviewers. Looking at the Lucene Java Jira, sometimes
"[patch]" is put at the beginning of the description of the issue to
indicate something similar, but this isn't used too consistently and
doesn't seem to be as effective. It also requires custom filters to
easily see all issues in the "patch submitted" state.
Is there sufficient interest to consider this for Lucene Java? (I'd
write "any interest", but since I'm interested, there's at least
some.)
There are a couple of issues around the Hadoop workflow I'm aware of.
One is that once an issue is closed, it can't be reopened. As I
understand it, this is because on Hadoop, they use the Jira feature
which allows automated generation of release notes. As someone who is
responsible for tracking the changes between releases for my company,
this is actually a win for me, so I like the way Hadoop does it. But it
does add the step of needing to start a new Jira issue rather than just
reopening an old one.
The other thing I was thinking of was the case where we say "if you're
working on something, go ahead and submit a patch even if it's not
polished or you aren't sure you want it to be a candidate for the
trunk. Let others look at it." I think that's clearly a good thing to
have, but I wonder what the best way to handle it in Jira is. What
state should be used?
There are quite a few open issues in Jira. Makes me a little
uncomfortable, since in my experience, when you get really long lists
of issues that are not addressed, the effectiveness of an issue
tracking system seems to drop dramatically. Anybody else think this?
I've got time to contribute to cleanup if there's sufficient interest.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org