Re: RoadMap?

2020-08-28 Thread Gus Heck
As amusing as the pattern has been (6.6, 7.7, 8.8?) we don't actually have
to release 9 point releases (9.9) before 10.0 :). I'd advocate that some
things we don't feel we can remove in 9.0 hang on for a few point releases
and when we're ready to ditch them we declare 10.0. If that's after 9.2 or
9.3 that's fine...

I agree that to drop support for something major we should hold a [VOTE].
Additionally, I suggest that the vote thread specify the timeline (in terms
of releases) for deprecation/removal.

-Gus

On Fri, Aug 28, 2020 at 10:50 AM Jan Høydahl  wrote:

> Noble, we all agree on those principles and that direction.
>
> But 9.0 is not the last chance to remove things. I think we must decide on
> a feature-by-feature basis:
> - Whether the feature should remain an ASF-maintained feature or not
> - If yes, we should make it into a 1st-party package distributed in our
> own repo
> - If no, we must decide what is the right time to remove it from the
> distro.
>    - If an alternative package already exists, it can be removed in the next
> major
>    - If not, we must decide how much time our users need to prepare an
> alternative (3rd-party pkg or home-grown)
>
> When we propose to stop maintaining a feature as part of the project, a
> [VOTE] thread is an excellent way to make such a decision.
>
> Jan
>
> > On 28 Aug 2020, at 14:35, Noble Paul wrote:
> >
> > We do not have to provide all features. Whatever feature we provide,
> > it should be reasonably bug free, performant and stable.
> >
> > There is no point in carrying around a lot of baggage if we are barely
> > able to carry it. There are a lot of "dark areas" in Solr which nobody
> > pays attention to. Those features should be removed altogether. If
> > there are committers who wish to actively support them, we can maintain
> > them in packages. If not, we should euthanize them gracefully.
> >
> > On Fri, Aug 28, 2020 at 5:43 PM Ishan Chattopadhyaya
> >  wrote:
> >>
> >>> The consensus from yesterday seems to be that stuff with a released
> replacement can/should be removed in 9.0, but for CDCR and SolrCell, where
> the project still wants to provide a better alternative, they can remain
> deprecated in 9.x.
> >>
> >> I disagree that "project still wants to provide a better alternative",
> at least for CDCR. There is no movement in that direction. Part of the
> reason people take supporting these features seriously is the threat of
> deprecation/removal (e.g. HDFS, Velocity, DIH, Autoscaling etc.). The
> moment we deprecate/remove SolrCell, we will see the better alternatives
> emerge. And both of them must be removed, even if better alternatives do
> not emerge. They both must be removed in 9.0. Let us not carry the burden
> into another major release.
> >>
> >> On Fri, Aug 28, 2020 at 12:49 PM Jan Høydahl 
> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I phrased that sentence in the roadmap Wiki, but I think the wording
> is more conservative than need-be. The intent was really to avoid a
> situation where 9.0 goes out the door «tomorrow» without a replacement for
> a popular feature that the community really wants. I attempted a re-phrase
> of that sentence after the meeting yesterday, but did not immediately find
> a better wording.
> >>>
> >>> Personally I think a deprecation in 8.6 can be removed in 9.0
> (there’ll be several months and 2’ish releases in between) if it has a well
> known, released replacement/package. And let’s link to those packages in
> ref-guide and link to the ref-guide from the release-note. E.g. the ref-guide
> currently says DIH is to be removed; perhaps that page could instead
> explain how to obtain the package, and at the same time encourage users to
> contribute to maintaining it?
> >>>
> >>> The consensus from yesterday seems to be that stuff with a released
> replacement can/should be removed in 9.0, but for CDCR and SolrCell, where
> the project still wants to provide a better alternative, they can remain
> deprecated in 9.x. In particular for SolrCell we can’t imagine how many users
> it has out there. Even after inventing its successor based on TikaServer,
> integrated in SolrJ or whatever, I would advocate for the good-old
> ExtractingRequestHandler to be available as a package for a few releases to
> come.
> >>>
> >>> Wrt whether something could be removed in 9.1 as long as it was
> deprecated in 8.x, I would initially say YES, at least legally/technically.
> We’re not breaking any back-compat promise as long as it has been
> prominently flagged as deprecated for so long. However, I can see

Re: RoadMap?

2020-08-28 Thread Jan Høydahl
Noble, we all agree on those principles and that direction.

But 9.0 is not the last chance to remove things. I think we must decide on a 
feature-by-feature basis:
- Whether the feature should remain an ASF-maintained feature or not
- If yes, we should make it into a 1st-party package distributed in our own repo
- If no, we must decide what is the right time to remove it from the distro. 
   - If an alternative package already exists, it can be removed in the next major
   - If not, we must decide how much time our users need to prepare an 
alternative (3rd-party pkg or home-grown)

When we propose to stop maintaining a feature as part of the project, a [VOTE] 
thread is an excellent way to make such a decision.

Jan

> On 28 Aug 2020, at 14:35, Noble Paul wrote:
> 
> We do not have to provide all features. Whatever feature we provide,
> it should be reasonably bug free, performant and stable.
> 
> There is no point in carrying around a lot of baggage if we are barely
> able to carry it. There are a lot of "dark areas" in Solr which nobody
> pays attention to. Those features should be removed altogether. If
> there are committers who wish to actively support them, we can maintain
> them in packages. If not, we should euthanize them gracefully.
> 
> On Fri, Aug 28, 2020 at 5:43 PM Ishan Chattopadhyaya
>  wrote:
>> 
>>> The consensus from yesterday seems to be that stuff with a released 
>>> replacement can/should be removed in 9.0, but for CDCR and SolrCell, where 
>>> the project still wants to provide a better alternative, they can remain 
>>> deprecated in 9.x.
>> 
>> I disagree that "project still wants to provide a better alternative", at 
>> least for CDCR. There is no movement in that direction. Part of the reason 
>> people take supporting these features seriously is the threat of 
>> deprecation/removal (e.g. HDFS, Velocity, DIH, Autoscaling etc.). The moment 
>> we deprecate/remove SolrCell, we will see the better alternatives emerge. 
>> And both of them must be removed, even if better alternatives do not emerge. 
>> They both must be removed in 9.0. Let us not carry the burden into another 
>> major release.
>> 
>> On Fri, Aug 28, 2020 at 12:49 PM Jan Høydahl  wrote:
>>> 
>>> Hi,
>>> 
>>> I phrased that sentence in the roadmap Wiki, but I think the wording is 
>>> more conservative than need-be. The intent was really to avoid a situation 
>>> where 9.0 goes out the door «tomorrow» without a replacement for a popular 
>>> feature that the community really wants. I attempted a re-phrase of that 
>>> sentence after the meeting yesterday, but did not immediately find a better 
>>> wording.
>>> 
>>> Personally I think a deprecation in 8.6 can be removed in 9.0 (there’ll be 
>>> several months and 2’ish releases in between) if it has a well known, 
>>> released replacement/package. And let’s link to those packages in ref-guide 
>>> and link to the ref-guide from the release-note. E.g. the ref-guide currently 
>>> says DIH is to be removed; perhaps that page could instead explain how to 
>>> obtain the package, and at the same time encourage users to contribute to 
>>> maintaining it?
>>> 
>>> The consensus from yesterday seems to be that stuff with a released 
>>> replacement can/should be removed in 9.0, but for CDCR and SolrCell, where 
>>> the project still wants to provide a better alternative, they can remain 
>>> deprecated in 9.x. In particular for SolrCell we can’t imagine how many users 
>>> it has out there. Even after inventing its successor based on TikaServer, 
>>> integrated in SolrJ or whatever, I would advocate for the good-old 
>>> ExtractingRequestHandler to be available as a package for a few releases to 
>>> come.
>>> 
>>> Wrt whether something could be removed in 9.1 as long as it was deprecated 
>>> in 8.x, I would initially say YES, at least legally/technically. We’re not 
>>> breaking any back-compat promise as long as it has been prominently flagged 
>>> as deprecated for so long. However, I can see how people who do not read the 
>>> documentation download 9.0.0, start using a deprecated feature and then 
>>> complain when it is gone in 9.3 :)
>>> 
>>> We also have an option to release Solr 10.0 (Solr X) sooner rather than 
>>> later (even on Lucene 9.x). Looks like we have tons of major goodies lined 
>>> up - it won’t all need to land in 9.0. Guess that’s what the Roadmap page 
>>> is there for. So as David says, let’s start placing the removal JIRAs into 
>>> the roadmap page and see if 

Re: RoadMap?

2020-08-28 Thread Noble Paul
We do not have to provide all features. Whatever feature we provide,
it should be reasonably bug free, performant and stable.

There is no point in carrying around a lot of baggage if we are barely
able to carry it. There are a lot of "dark areas" in Solr which nobody
pays attention to. Those features should be removed altogether. If
there are committers who wish to actively support them, we can maintain
them in packages. If not, we should euthanize them gracefully.

On Fri, Aug 28, 2020 at 5:43 PM Ishan Chattopadhyaya
 wrote:
>
> > The consensus from yesterday seems to be that stuff with a released 
> > replacement can/should be removed in 9.0, but for CDCR and SolrCell, where 
> > the project still wants to provide a better alternative, they can remain 
> > deprecated in 9.x.
>
> I disagree that "project still wants to provide a better alternative", at 
> least for CDCR. There is no movement in that direction. Part of the reason 
> people take supporting these features seriously is the threat of 
> deprecation/removal (e.g. HDFS, Velocity, DIH, Autoscaling etc.). The moment 
> we deprecate/remove SolrCell, we will see the better alternatives emerge. And 
> both of them must be removed, even if better alternatives do not emerge. They 
> both must be removed in 9.0. Let us not carry the burden into another major 
> release.
>
> On Fri, Aug 28, 2020 at 12:49 PM Jan Høydahl  wrote:
>>
>> Hi,
>>
>> I phrased that sentence in the roadmap Wiki, but I think the wording is more 
>> conservative than need-be. The intent was really to avoid a situation where 
>> 9.0 goes out the door «tomorrow» without a replacement for a popular feature 
>> that the community really wants. I attempted a re-phrase of that sentence 
>> after the meeting yesterday, but did not immediately find a better wording.
>>
>> Personally I think a deprecation in 8.6 can be removed in 9.0 (there’ll be 
>> several months and 2’ish releases in between) if it has a well known, 
>> released replacement/package. And let’s link to those packages in ref-guide 
>> and link to the ref-guide from the release-note. E.g. the ref-guide currently 
>> says DIH is to be removed; perhaps that page could instead explain how to 
>> obtain the package, and at the same time encourage users to contribute to 
>> maintaining it?
>>
>> The consensus from yesterday seems to be that stuff with a released 
>> replacement can/should be removed in 9.0, but for CDCR and SolrCell, where 
>> the project still wants to provide a better alternative, they can remain 
>> deprecated in 9.x. In particular for SolrCell we can’t imagine how many users 
>> it has out there. Even after inventing its successor based on TikaServer, 
>> integrated in SolrJ or whatever, I would advocate for the good-old 
>> ExtractingRequestHandler to be available as a package for a few releases to 
>> come.
>>
>> Wrt whether something could be removed in 9.1 as long as it was deprecated 
>> in 8.x, I would initially say YES, at least legally/technically. We’re not 
>> breaking any back-compat promise as long as it has been prominently flagged 
>> as deprecated for so long. However, I can see how people who do not read the 
>> documentation download 9.0.0, start using a deprecated feature and then 
>> complain when it is gone in 9.3 :)
>>
>> We also have an option to release Solr 10.0 (Solr X) sooner rather than 
>> later (even on Lucene 9.x). Looks like we have tons of major goodies lined 
>> up - it won’t all need to land in 9.0. Guess that’s what the Roadmap page is 
>> there for. So as David says, let’s start placing the removal JIRAs into the 
>> roadmap page and see if we’re still on the same page?
>>
>> Jan
>>
>> On 28 Aug 2020, at 07:43, David Smiley wrote:
>>
>>
>> On Thu, Aug 27, 2020 at 7:03 PM Ishan Chattopadhyaya 
>>  wrote:
>>>
>>> > I find it highly depressing that we can't, *in a major release*, manage 
>>> > to get rid of our deprecations -- particularly for code that has a new 
>>> > home and is packaged in a form that is trivial to install (thanks to our 
>>> > new awesome package manager).
>>>
>>> I'm not sure why you think "we can't". I can't even remember a single 
>>> committer standing in the way of removing those *that already have a 
>>> package*.
>>
>>
>> Okay, maybe I read the intent wrong.  I can see the example given was about 
>> Solr Cell, which apparently has no new home, so I'm +0 with keeping it for 
>> 9.0.
>>
>> Also, on the roadmap cwiki:
>>
>>> We should not remove all featur

Re: RoadMap?

2020-08-28 Thread Ishan Chattopadhyaya
> The consensus from yesterday seems to be that stuff with a released
replacement can/should be removed in 9.0, but for CDCR and SolrCell, where
the project still wants to provide a better alternative, they can remain
deprecated in 9.x.

I disagree that "project still wants to provide a better alternative", at
least for CDCR. There is no movement in that direction. Part of the reason
people take supporting these features seriously is the threat of
deprecation/removal (e.g. HDFS, Velocity, DIH, Autoscaling etc.). The
moment we deprecate/remove SolrCell, we will see the better alternatives
emerge. And both of them must be removed, even if better alternatives do
not emerge. They both must be removed in 9.0. Let us not carry the burden
into another major release.

On Fri, Aug 28, 2020 at 12:49 PM Jan Høydahl  wrote:

> Hi,
>
> I phrased that sentence in the roadmap Wiki, but I think the wording is
> more conservative than need-be. The intent was really to avoid a situation
> where 9.0 goes out the door «tomorrow» without a replacement for a popular
> feature that the community really wants. I attempted a re-phrase of that
> sentence after the meeting yesterday, but did not immediately find a better
> wording.
>
> Personally I think a deprecation in 8.6 can be removed in 9.0 (there’ll be
> several months and 2’ish releases in between) if it has a well known,
> released replacement/package. And let’s link to those packages in ref-guide
> and link to the ref-guide from the release-note. E.g. the ref-guide
> <https://nightlies.apache.org/Lucene/Solr-reference-guide-master/uploading-structured-data-store-data-with-the-data-import-handler.html>
> currently says DIH is to be removed; perhaps that page could instead explain how to
> obtain the package, and at the same time encourage users to contribute to
> maintaining it?
>
> The consensus from yesterday seems to be that stuff with a released
> replacement can/should be removed in 9.0, but for CDCR and SolrCell, where
> the project still wants to provide a better alternative, they can remain
> deprecated in 9.x. In particular for SolrCell we can’t imagine how many users
> it has out there. Even after inventing its successor based on TikaServer,
> integrated in SolrJ or whatever, I would advocate for the good-old
> ExtractingRequestHandler to be available as a package for a few releases to
> come.
>
> Wrt whether something could be removed in 9.1 as long as it was deprecated
> in 8.x, I would initially say YES, at least legally/technically. We’re not
> breaking any back-compat promise as long as it has been prominently flagged
> as deprecated for so long. However, I can see how people who do not read the
> documentation download 9.0.0, start using a deprecated feature and then
> complain when it is gone in 9.3 :)
>
> We also have an option to release Solr 10.0 (Solr X) sooner rather than
> later (even on Lucene 9.x). Looks like we have tons of major goodies lined
> up - it won’t all need to land in 9.0. Guess that’s what the Roadmap page
> is there for. So as David says, let’s start placing the removal JIRAs into
> the roadmap page and see if we’re still on the same page?
>
> Jan
>
> On 28 Aug 2020, at 07:43, David Smiley wrote:
>
>
> On Thu, Aug 27, 2020 at 7:03 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> > I find it highly depressing that we can't, *in a major release*, manage
>> to get rid of our deprecations -- particularly for code that has a new home
>> and is packaged in a form that is trivial to install (thanks to our new
>> awesome package manager).
>>
>> I'm not sure why you think "we can't". I can't even remember a single
>> committer standing in the way of removing those *that already have a
>> package*.
>>
>
> Okay, maybe I read the intent wrong.  I can see the example given was
> about Solr Cell, which apparently has no new home, so I'm +0 with keeping
> it for 9.0.
>
> Also, on the roadmap cwiki:
>
>> We should *not* remove all features/APIs deprecated in 8.x yet, to give
>> users a path to upgrade to 9.x without all the extra noise. Deprecated
>> features can be removed in a later 9.x release, when the new alternative is
>> solid and well known.
>
>
> Again, maybe I'm misreading but I'd like us to manage to remove a lot
> of deprecated stuff *as the norm*.  There will be exceptions to the norm
> -- Solr Cell, CDCR.  To make this point clear, I wish to add to the
> roadmap, Solr 9.0 table, first row, saying basically "Remove lots of
> deprecated stuff" with some JIRAs linked like
> https://issues.apache.org/jira/browse/SOLR-13138
>
> ~ David
>
>
>


Re: RoadMap?

2020-08-28 Thread Jan Høydahl
Hi,

I phrased that sentence in the roadmap Wiki, but I think the wording is more 
conservative than need-be. The intent was really to avoid a situation where 9.0 
goes out the door «tomorrow» without a replacement for a popular feature that 
the community really wants. I attempted a re-phrase of that sentence after the 
meeting yesterday, but did not immediately find a better wording.

Personally I think a deprecation in 8.6 can be removed in 9.0 (there’ll be 
several months and 2’ish releases in between) if it has a well known, released 
replacement/package. And let’s link to those packages in ref-guide and link to 
the ref-guide from the release-note. E.g. the ref-guide 
<https://nightlies.apache.org/Lucene/Solr-reference-guide-master/uploading-structured-data-store-data-with-the-data-import-handler.html>
currently says DIH is to be removed; perhaps that page could instead explain 
how to obtain the package, and at the same time encourage users to contribute 
to maintaining it?

The consensus from yesterday seems to be that stuff with a released replacement 
can/should be removed in 9.0, but for CDCR and SolrCell, where the project 
still wants to provide a better alternative, they can remain deprecated in 9.x. In 
particular for SolrCell we can’t imagine how many users it has out there. Even 
after inventing its successor based on TikaServer, integrated in SolrJ or 
whatever, I would advocate for the good-old ExtractingRequestHandler to be 
available as a package for a few releases to come.

Wrt whether something could be removed in 9.1 as long as it was deprecated in 
8.x, I would initially say YES, at least legally/technically. We’re not 
breaking any back-compat promise as long as it has been prominently flagged as 
deprecated for so long. However, I can see how people who do not read the documentation 
download 9.0.0, start using a deprecated feature and then complain when it 
is gone in 9.3 :)

We also have an option to release Solr 10.0 (Solr X) sooner rather than later 
(even on Lucene 9.x). Looks like we have tons of major goodies lined up - it 
won’t all need to land in 9.0. Guess that’s what the Roadmap page is there for. 
So as David says, let’s start placing the removal JIRAs into the roadmap page 
and see if we’re still on the same page?

Jan

> On 28 Aug 2020, at 07:43, David Smiley wrote:
> 
> 
> On Thu, Aug 27, 2020 at 7:03 PM Ishan Chattopadhyaya 
> <ichattopadhy...@gmail.com> wrote:
> > I find it highly depressing that we can't, *in a major release*, manage to 
> > get rid of our deprecations -- particularly for code that has a new home 
> > and is packaged in a form that is trivial to install (thanks to our new 
> > awesome package manager). 
> 
> I'm not sure why you think "we can't". I can't even remember a single 
> committer standing in the way of removing those *that already have a 
> package*. 
> 
> Okay, maybe I read the intent wrong.  I can see the example given was about 
> Solr Cell, which apparently has no new home, so I'm +0 with keeping it for 
> 9.0.  
> 
> Also, on the roadmap cwiki:
> 
> We should not remove all features/APIs deprecated in 8.x yet, to give users a 
> path to upgrade to 9.x without all the extra noise. Deprecated features can 
> be removed in a later 9.x release, when the new alternative is solid and well 
> known.
> 
> Again, maybe I'm misreading but I'd like us to manage to remove a lot of 
> deprecated stuff as the norm.  There will be exceptions to the norm -- Solr 
> Cell, CDCR.  To make this point clear, I wish to add to the roadmap, Solr 9.0 
> table, first row, saying basically "Remove lots of deprecated stuff" with 
> some JIRAs linked like https://issues.apache.org/jira/browse/SOLR-13138
> 
> ~ David



Re: RoadMap?

2020-08-27 Thread David Smiley
On Thu, Aug 27, 2020 at 7:03 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> > I find it highly depressing that we can't, *in a major release*, manage
> to get rid of our deprecations -- particularly for code that has a new home
> and is packaged in a form that is trivial to install (thanks to our new
> awesome package manager).
>
> I'm not sure why you think "we can't". I can't even remember a single
> committer standing in the way of removing those *that already have a
> package*.
>

Okay, maybe I read the intent wrong.  I can see the example given was about
Solr Cell, which apparently has no new home, so I'm +0 with keeping it for
9.0.

Also, on the roadmap cwiki:

> We should *not* remove all features/APIs deprecated in 8.x yet, to give
> users a path to upgrade to 9.x without all the extra noise. Deprecated
> features can be removed in a later 9.x release, when the new alternative is
> solid and well known.


Again, maybe I'm misreading but I'd like us to manage to remove a lot of
deprecated stuff *as the norm*.  There will be exceptions to the norm --
Solr Cell, CDCR.  To make this point clear, I wish to add to the roadmap,
Solr 9.0 table, first row, saying basically "Remove lots of deprecated
stuff" with some JIRAs linked like
https://issues.apache.org/jira/browse/SOLR-13138

~ David


Re: RoadMap?

2020-08-27 Thread Alexandre Rafalovitch
 read very carefully through SIP-10, of
> > >> which this is just a first step.
> > >>
> > >> In general, maybe we can manage to do so many new features and cleanup
> > >> in 9 that it will make Solr TLP look like a great Big Bang moment...
> > >>
> > >> And it will probably take a little longer to achieve that, so the -
> > >> effective - deprecation schedule would still be ok.
> > >>
> > >> Regards,
> > >>Alex.
> > >>
> > >> On Thu, 27 Aug 2020 at 18:35, David Smiley  wrote:
> > >> >>
> > >> >> It has been proposed on the list to NOT rip out all deprecations in 
> > >> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still 
> > >> >> available, and then have yet some time to change their processes to 
> > >> >> adapt to the new way of doing stuff. I like that proposal. Sure, 9.0 
> > >> >> will remove lots of deprecated code, but I think it is a mistake to 
> > >> >> do all of the proposed removals at once. We can spread removals out 
> > >> >> in 9.x releases, after users have had a few releases with a choice 
> > >> >> between old and new and the new alternative is solid.
> > >> >
> > >> >
> > >> > I find it highly depressing that we can't, *in a major release*, 
> > >> > manage to get rid of our deprecations -- particularly for code that 
> > >> > has a new home and is packaged in a form that is trivial to install 
> > >> > (thanks to our new awesome package manager).  I'm sympathetic to 
> > >> > waiting to delete until *after* there is an actual package ready at 
> > >> > that time (rather than just the promise of one).
> > >> >
> > >> > Also, users generally are cautious about performing a major version 
> > >> > upgrade.  There's time.
> > >> >
> > >> > ~ David Smiley
> > >> > Apache Lucene/Solr Search Developer
> > >> > http://www.linkedin.com/in/davidwsmiley
> > >> >
> > >> >
> > >> > On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl  
> > >> > wrote:
> > >> >>
> > >> >> I edited the page to introduce the (super important) Solr TLP split 
> > >> >> into the roadmap.
> > >> >> Also added a rough timeframe and a «major theme» for each release 
> > >> >> above the issue table.
> > >> >> I added 8.8 and 9.1 as I think it is important to track what gets 
> > >> >> done just before 9.0 and what can be deferred to after 9.0.
> > >> >>
> > >> >> It has been proposed on the list to NOT rip out all deprecations in 
> > >> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still 
> > >> >> available, and then have yet some time to change their processes to 
> > >> >> adapt to the new way of doing stuff. I like that proposal. Sure, 9.0 
> > >> >> will remove lots of deprecated code, but I think it is a mistake to 
> > >> >> do all of the proposed removals at once. We can spread removals out 
> > >> >> in 9.x releases, after users have had a few releases with a choice 
> > >> >> between old and new and the new alternative is solid.
> > >> >>
> > >> >> Thanks Gus for taking ownership and suggesting a process! Feel free 
> > >> >> to rework what I edited into a structure you see more fit.
> > >> >>
> > >> >> Jan
> > >> >>
> > >> >> On 11 Aug 2020, at 18:51, Gus Heck wrote:
> > >> >>
> > >> >> I was thinking that level of detail is in the Jira... I don't see any 
> > >> >> reason for things to disappear (in fact rejected should go in a 
> > >> >> rejected list for future reference.)
> > >> >>
> > >> >> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  
> > >> >> wrote:
> > >> >>>
> > >> >>> Maybe also add “in progress”? So items do not disappear suddenly 
> > >> >>> from the page when work really starts on them?
> > >> >>>
> > >> >>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
> > >> >>>>
> > >> >>>> Cool, since I brought it up, I can vo

Re: RoadMap?

2020-08-27 Thread Erick Erickson
> >> >> It has been proposed on the list to NOT rip out all deprecations in 
> >> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still 
> >> >> available, and then have yet some time to change their processes to 
> >> >> adapt to the new way of doing stuff. I like that proposal. Sure, 9.0 
> >> >> will remove lots of deprecated code, but I think it is a mistake to do 
> >> >> all of the proposed removals at once. We can spread removals out in 9.x 
> >> >> releases, after users have had a few releases with a choice between old 
> >> >> and new and the new alternative is solid.
> >> >
> >> >
> >> > I find it highly depressing that we can't, *in a major release*, manage 
> >> > to get rid of our deprecations -- particularly for code that has a new 
> >> > home and is packaged in a form that is trivial to install (thanks to our 
> >> > new awesome package manager).  I'm sympathetic to waiting to delete 
> >> > until *after* there is an actual package ready at that time (rather than 
> >> > just the promise of one).
> >> >
> >> > Also, users generally are cautious about performing a major version 
> >> > upgrade.  There's time.
> >> >
> >> > ~ David Smiley
> >> > Apache Lucene/Solr Search Developer
> >> > http://www.linkedin.com/in/davidwsmiley
> >> >
> >> >
> >> > On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl  
> >> > wrote:
> >> >>
> >> >> I edited the page to introduce the (super important) Solr TLP split 
> >> >> into the roadmap.
> >> >> Also added a rough timeframe and a «major theme» for each release above 
> >> >> the issue table.
> >> >> I added 8.8 and 9.1 as I think it is important to track what gets done 
> >> >> just before 9.0 and what can be deferred to after 9.0.
> >> >>
> >> >> It has been proposed on the list to NOT rip out all deprecations in 
> >> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still 
> >> >> available, and then have yet some time to change their processes to 
> >> >> adapt to the new way of doing stuff. I like that proposal. Sure, 9.0 
> >> >> will remove lots of deprecated code, but I think it is a mistake to do 
> >> >> all of the proposed removals at once. We can spread removals out in 9.x 
> >> >> releases, after users have had a few releases with a choice between old 
> >> >> and new and the new alternative is solid.
> >> >>
> >> >> Thanks Gus for taking ownership and suggesting a process! Feel free to 
> >> >> rework what I edited into a structure you see more fit.
> >> >>
> >> >> Jan
> >> >>
> >> >> On 11 Aug 2020, at 18:51, Gus Heck wrote:
> >> >>
> >> >> I was thinking that level of detail is in the Jira... I don't see any 
> >> >> reason for things to disappear (in fact rejected should go in a 
> >> >> rejected list for future reference.)
> >> >>
> >> >> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  
> >> >> wrote:
> >> >>>
> >> >>> Maybe also add “in progress”? So items do not disappear suddenly from 
> >> >>> the page when work really starts on them?
> >> >>>
> >> >>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
> >> >>>>
> >> >>>> Cool, since I brought it up, I can volunteer to help manage the page. 
> >> >>>> We should get jira issue links in there wherever possible. Do we want 
> >> >>>> to build an initial list and have some sort of Proposed/Planned 
> >> >>>> workflow so readers can have confidence (or appropriate lack of 
> >> >>>> confidence) in what they see there? Voting on things seems like too 
> >> >>>> much but maybe folks who care watch the page, and if something is on 
> >> >>>> there for a week without objection it can be called accepted? If a 
> >> >>>> discussion starts here it can be marked "Considering" so... something 
> >> >>>> like this:
> >> >>>>
> >> >>>> 4 states: Proposed, Considering, Planned, Rejected
> >> >>>>
> >> >>>> Workflow like this:
> >> >

Re: RoadMap?

2020-08-27 Thread Ishan Chattopadhyaya
It does start. It is broken in the sense that it is fraught with the danger of
users shooting themselves in the foot. Some context here:
https://issues.apache.org/jira/browse/SOLR-14616?focusedCommentId=17153129&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17153129

On Fri, Aug 28, 2020 at 4:52 AM Alexandre Rafalovitch 
wrote:

> If CDCR is actively broken (does not start?), then isn't it
> effectively deprecated from the last version that did not work? And if
> it is not going to be maintained, then isn't the 'latest' version
> whichever one we have not yet deleted it from? A broken feature is
> only worth keeping if we ever plan to fix it.
>
> We have been through the same with UIMA, if I recall. It was broken
> for a bit and then when I pulled it, ONE person got all upset.
> SOLR-11694
>
> Regards,
>Alex
> Ps. I don't know the degree of 'broken' of this specific feature. So,
> I am mostly talking practical principles here.
>
> On Thu, 27 Aug 2020 at 19:03, Ishan Chattopadhyaya
>  wrote:
> >
> > > I find it highly depressing that we can't, *in a major release*,
> > > manage to get rid of our deprecations -- particularly for code that has a
> > > new home and is packaged in a form that is trivial to install (thanks to
> > > our new awesome package manager).
> >
> > I'm not sure why you think "we can't". I can't even remember a single
> > committer standing in the way of removing those *that already have a
> > package*. However, there's a backlash against removing CDCR even though
> > there is no one volunteering to support it (as a package) and it is clearly
> > broken, which is what totally puzzles me.
> > https://issues.apache.org/jira/browse/SOLR-14616
> >
> > On Fri, Aug 28, 2020 at 4:19 AM Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> >>
> >> Well, I have created SOLR-14783 (Remove DIH from 9.0) and am busily
> >> learning magic gradle commands to make that happen without leaving
> >> behind random crumbs.  Once that lands, I will do Jira search on all
> >> DIH still-open tasks after that and close them pointing to the said
> >> Jira.
> >>
> >> So, I guess somebody better -1 the Jira if they really want that one
> >> to stay until ... ? And then read very carefully through SIP-10 of
> >> which, this is just a first step.
> >>
> >> In general, maybe we can manage to do so many new features and cleanup
> >> in 9 that will make Solr TLP look like a great Big Bang moment...
> >>
> >> And it will probably take a little longer to achieve that, so the -
> >> effective - deprecation schedule would still be ok.
> >>
> >> Regards,
> >>Alex.
> >>
> >> On Thu, 27 Aug 2020 at 18:35, David Smiley  wrote:
> >> >>
> >> >> It has been proposed on the list to NOT rip out all deprecations in
> >> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still available,
> >> >> and then have yet some time to change their processes to adapt to the new
> >> >> way of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> >> >> deprecated code, but I think it is a mistake to do all of the proposed
> >> >> removals at once. We can spread removals out in 9.x releases, after users
> >> >> have had a few releases with a choice between old and new and the new
> >> >> alternative is solid.
> >> >
> >> >
> >> > I find it highly depressing that we can't, *in a major release*,
> >> > manage to get rid of our deprecations -- particularly for code that has a
> >> > new home and is packaged in a form that is trivial to install (thanks to
> >> > our new awesome package manager).  I'm sympathetic to waiting to delete
> >> > until *after* there is an actual package ready at that time (rather than
> >> > just the promise of one).
> >> >
> >> > Also, users generally are cautious on performing a major version
> >> > upgrade.  There's time.
> >> >
> >> > ~ David Smiley
> >> > Apache Lucene/Solr Search Developer
> >> > http://www.linkedin.com/in/davidwsmiley
> >> >
> >> >
> >> > On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl wrote:
> >> >>
> >> >> I edited the page to introduce the (super important) Solr TLP split
> >> >> into the roadmap.
> >> >> Also added a rough timeframe and a «major theme» for each release
> >> >> above the issue table.
> >> >> I added 8.8 and 9.1 as I think it is important to track what gets
> >> >> done just before 9.0 and what can be deferred to after 9.0.
> >> >>
> >> >> It has been proposed on the list to NOT rip out all deprecations in

Re: RoadMap?

2020-08-27 Thread Alexandre Rafalovitch
If CDCR is actively broken (does not start?), then isn't it
effectively deprecated from the last version that did not work? And if
it is not going to be maintained, then isn't the 'latest' version
whichever one we have not yet deleted it from? A broken feature is
only worth keeping in if we ever plan to fix it.

We have been through the same with UIMA, if I recall. It was broken
for a bit and then when I pulled it, ONE person got all upset.
SOLR-11694

Regards,
   Alex
Ps. I don't know the degree of 'broken' of this specific feature. So,
I am mostly talking practical principles here.

On Thu, 27 Aug 2020 at 19:03, Ishan Chattopadhyaya
 wrote:
>
> > I find it highly depressing that we can't, *in a major release*, manage to 
> > get rid of our deprecations -- particularly for code that has a new home 
> > and is packaged in a form that is trivial to install (thanks to our new 
> > awesome package manager).
>
> I'm not sure why you think "we can't". I can't even remember a single 
> committer standing in the way of removing those *that already have a 
> package*. However, there's a backlash against removing CDCR even though there 
> is no one volunteering to support it (as a package) and it is clearly broken, 
> which is what totally puzzles me. 
> https://issues.apache.org/jira/browse/SOLR-14616
>
> On Fri, Aug 28, 2020 at 4:19 AM Alexandre Rafalovitch  
> wrote:
>>
>> Well, I have created SOLR-14783 (Remove DIH from 9.0) and am busily
>> learning magic gradle commands to make that happen without leaving
>> behind random crumbs.  Once that lands, I will do Jira search on all
>> DIH still-open tasks after that and close them pointing to the said
>> Jira.
>>
>> So, I guess somebody better -1 the Jira if they really want that one
>> to stay until ... ? And then read very carefully through SIP-10 of
>> which, this is just a first step.
>>
>> In general, maybe we can manage to do so many new features and cleanup
>> in 9 that will make Solr TLP look like a great Big Bang moment...
>>
>> And it will probably take a little longer to achieve that, so the -
>> effective - deprecation schedule would still be ok.
>>
>> Regards,
>>Alex.
>>
>> On Thu, 27 Aug 2020 at 18:35, David Smiley  wrote:
>> >>
>> >> It has been proposed on the list to NOT rip out all deprecations in 9.0, 
>> >> but allow users to upgrade to 9.0 with e.g. SolrCell still available, and 
>> >> then have yet some time to change their processes to adapt to the new way 
>> >> of doing stuff. I like that proposal. Sure, 9.0 will remove lots of 
>> >> deprecated code, but I think it is a mistake to do all of the proposed 
>> >> removals at once. We can spread removals out in 9.x releases, after users 
>> >> have had a few releases with a choice between old and new and the new 
>> >> alternative is solid.
>> >
>> >
>> > I find it highly depressing that we can't, *in a major release*, manage to 
>> > get rid of our deprecations -- particularly for code that has a new home 
>> > and is packaged in a form that is trivial to install (thanks to our new 
>> > awesome package manager).  I'm sympathetic to waiting to delete until 
>> > *after* there is an actual package ready at that time (rather than just 
>> > the promise of one).
>> >
>> > Also, users generally are cautious on performing a major version upgrade.  
>> > There's time.
>> >
>> > ~ David Smiley
>> > Apache Lucene/Solr Search Developer
>> > http://www.linkedin.com/in/davidwsmiley
>> >
>> >
>> > On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl  wrote:
>> >>
>> >> I edited the page to introduce the (super important) Solr TLP split into 
>> >> the roadmap.
>> >> Also added a rough timeframe and a «major theme» for each release above 
>> >> the issue table.
>> >> I added 8.8 and 9.1 as I think it is important to track what gets done 
>> >> just before 9.0 and what can be deferred to after 9.0.
>> >>
>> >> It has been proposed on the list to NOT rip out all deprecations in 9.0, 
>> >> but allow users to upgrade to 9.0 with e.g. SolrCell still available, and 
>> >> then have yet some time to change their processes to adapt to the new way 
>> >> of doing stuff. I like that proposal. Sure, 9.0 will remove lots of 
>> >> deprecated code, but I think it is a mistake to do all of the proposed 
>> >> removals at once. We can spread removals out in 9.x releases, after users 
>>

Re: RoadMap?

2020-08-27 Thread Ishan Chattopadhyaya
> I find it highly depressing that we can't, *in a major release*, manage
> to get rid of our deprecations -- particularly for code that has a new home
> and is packaged in a form that is trivial to install (thanks to our new
> awesome package manager).

I'm not sure why you think "we can't". I can't even remember a single
committer standing in the way of removing those *that already have a
package*. However, there's a backlash against removing CDCR even though
there is no one volunteering to support it (as a package) and it is clearly
broken, which is what totally puzzles me.
https://issues.apache.org/jira/browse/SOLR-14616

On Fri, Aug 28, 2020 at 4:19 AM Alexandre Rafalovitch 
wrote:

> Well, I have created SOLR-14783 (Remove DIH from 9.0) and am busily
> learning magic gradle commands to make that happen without leaving
> behind random crumbs.  Once that lands, I will do Jira search on all
> DIH still-open tasks after that and close them pointing to the said
> Jira.
>
> So, I guess somebody better -1 the Jira if they really want that one
> to stay until ... ? And then read very carefully through SIP-10 of
> which, this is just a first step.
>
> In general, maybe we can manage to do so many new features and cleanup
> in 9 that will make Solr TLP look like a great Big Bang moment...
>
> And it will probably take a little longer to achieve that, so the -
> effective - deprecation schedule would still be ok.
>
> Regards,
>Alex.
>
> On Thu, 27 Aug 2020 at 18:35, David Smiley  wrote:
> >>
> >> It has been proposed on the list to NOT rip out all deprecations in
> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still available,
> >> and then have yet some time to change their processes to adapt to the new
> >> way of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> >> deprecated code, but I think it is a mistake to do all of the proposed
> >> removals at once. We can spread removals out in 9.x releases, after users
> >> have had a few releases with a choice between old and new and the new
> >> alternative is solid.
> >
> >
> > I find it highly depressing that we can't, *in a major release*, manage
> > to get rid of our deprecations -- particularly for code that has a new home
> > and is packaged in a form that is trivial to install (thanks to our new
> > awesome package manager).  I'm sympathetic to waiting to delete until
> > *after* there is an actual package ready at that time (rather than just the
> > promise of one).
> >
> > Also, users generally are cautious on performing a major version
> > upgrade.  There's time.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl wrote:
> >>
> >> I edited the page to introduce the (super important) Solr TLP split
> >> into the roadmap.
> >> Also added a rough timeframe and a «major theme» for each release above
> >> the issue table.
> >> I added 8.8 and 9.1 as I think it is important to track what gets done
> >> just before 9.0 and what can be deferred to after 9.0.
> >>
> >> It has been proposed on the list to NOT rip out all deprecations in
> >> 9.0, but allow users to upgrade to 9.0 with e.g. SolrCell still available,
> >> and then have yet some time to change their processes to adapt to the new
> >> way of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> >> deprecated code, but I think it is a mistake to do all of the proposed
> >> removals at once. We can spread removals out in 9.x releases, after users
> >> have had a few releases with a choice between old and new and the new
> >> alternative is solid.
> >>
> >> Thanks Gus for taking ownership and suggesting a process! Feel free to
> >> rework what I edited into a structure you see more fit.
> >>
> >> Jan
> >>
> >> 11. aug. 2020 kl. 18:51 skrev Gus Heck :
> >>
> >> I was thinking that level of detail is in the Jira... I don't see any
> >> reason for things to disappear (in fact rejected should go in a rejected
> >> list for future reference.)
> >>
> >> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg wrote:
> >>>
> >>> Maybe also add “in progress”? So items do not disappear suddenly from
> >>> the page when work really starts on them?
> >>>
> >>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
> >>>>
> >>>> Cool, since I brought it up, I can volunteer to help manage the page.
> >>>> We should get jira issue links in there wherever possible. Do we want to
> >>>> build an initial list and have some sort of Proposed/Planned workflow so
> >>>> readers can have 

Re: RoadMap?

2020-08-27 Thread Alexandre Rafalovitch
Well, I have created SOLR-14783 (Remove DIH from 9.0) and am busily
learning magic gradle commands to make that happen without leaving
behind random crumbs.  Once that lands, I will do a Jira search for all
still-open DIH tasks and close them, pointing to that Jira.

So, I guess somebody better -1 the Jira if they really want that one
to stay until ... ? And then read very carefully through SIP-10, of
which this is just a first step.

In general, maybe we can manage to do so many new features and so much
cleanup in 9 that it will make Solr TLP look like a great Big Bang moment...

And it will probably take a little longer to achieve that, so the -
effective - deprecation schedule would still be ok.

Regards,
   Alex.

On Thu, 27 Aug 2020 at 18:35, David Smiley  wrote:
>>
>> It has been proposed on the list to NOT rip out all deprecations in 9.0, but 
>> allow users to upgrade to 9.0 with e.g. SolrCell still available, and then 
>> have yet some time to change their processes to adapt to the new way of 
>> doing stuff. I like that proposal. Sure, 9.0 will remove lots of deprecated 
>> code, but I think it is a mistake to do all of the proposed removals at 
>> once. We can spread removals out in 9.x releases, after users have had a few 
>> releases with a choice between old and new and the new alternative is solid.
>
>
> I find it highly depressing that we can't, *in a major release*, manage to 
> get rid of our deprecations -- particularly for code that has a new home and 
> is packaged in a form that is trivial to install (thanks to our new awesome 
> package manager).  I'm sympathetic to waiting to delete until *after* there 
> is an actual package ready at that time (rather than just the promise of one).
>
> Also, users generally are cautious on performing a major version upgrade.  
> There's time.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl  wrote:
>>
>> I edited the page to introduce the (super important) Solr TLP split into the 
>> roadmap.
>> Also added a rough timeframe and a «major theme» for each release above the 
>> issue table.
>> I added 8.8 and 9.1 as I think it is important to track what gets done just 
>> before 9.0 and what can be deferred to after 9.0.
>>
>> It has been proposed on the list to NOT rip out all deprecations in 9.0, but 
>> allow users to upgrade to 9.0 with e.g. SolrCell still available, and then 
>> have yet some time to change their processes to adapt to the new way of 
>> doing stuff. I like that proposal. Sure, 9.0 will remove lots of deprecated 
>> code, but I think it is a mistake to do all of the proposed removals at 
>> once. We can spread removals out in 9.x releases, after users have had a few 
>> releases with a choice between old and new and the new alternative is solid.
>>
>> Thanks Gus for taking ownership and suggesting a process! Feel free to 
>> rework what I edited into a structure you see more fit.
>>
>> Jan
>>
>> 11. aug. 2020 kl. 18:51 skrev Gus Heck :
>>
>> I was thinking that level of detail is in the Jira... I don't see any reason 
>> for things to disappear (in fact rejected should go in a rejected list for 
>> future reference.)
>>
>> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:
>>>
>>> Maybe also add “in progress”? So items do not disappear suddenly from the 
>>> page when work really starts on them?
>>>
>>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>>>>
>>>> Cool, since I brought it up, I can volunteer to help manage the page. We 
>>>> should get jira issue links in there wherever possible. Do we want to 
>>>> build an initial list and have some sort of Proposed/Planned workflow so 
>>>> readers can have confidence (or appropriate lack of confidence) in what 
>>>> they see there? voting on things seems like too much but maybe folks who 
>>>> care watch the page, and if something is on there for a week without 
>>>> objection it can be called accepted? If a discussion starts here it can be 
>>>> marked "Considering" so... something like this:
>>>>
>>>> 4 states: Proposed, Considering, Planned, Rejected
>>>>
>>>> Workflow like this:
>>>> Proposed ---(no objection 1 wk) --> Planned
>>>> Proposed ---(discussion)--> Considering
>>>> Considering (agreement) --> Planned
>>>> Considering (deferred) ---> Proposed (later release)
>>>> Considering (unsui

Re: RoadMap?

2020-08-27 Thread Erick Erickson
IMO, it’s super-awkward to preserve something for 9.0 and then remove it in 9.1. I
understand the tendency to say “deprecating it in 8.7 then removing it in the
next release is too fast”, but keeping it in 9.0 only to drop it in 9.1 strikes
me as even worse than taking it out of 9.0.

> On Aug 27, 2020, at 6:35 PM, David Smiley  wrote:
> 
> It has been proposed on the list to NOT rip out all deprecations in 9.0, but 
> allow users to upgrade to 9.0 with e.g. SolrCell still available, and then 
> have yet some time to change their processes to adapt to the new way of doing 
> stuff. I like that proposal. Sure, 9.0 will remove lots of deprecated code, 
> but I think it is a mistake to do all of the proposed removals at once. We 
> can spread removals out in 9.x releases, after users have had a few releases 
> with a choice between old and new and the new alternative is solid.
> 
> I find it highly depressing that we can't, *in a major release*, manage to 
> get rid of our deprecations -- particularly for code that has a new home and 
> is packaged in a form that is trivial to install (thanks to our new awesome 
> package manager).  I'm sympathetic to waiting to delete until *after* there 
> is an actual package ready at that time (rather than just the promise of one).
> 
> Also, users generally are cautious on performing a major version upgrade.  
> There's time.
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
> 
> 
> On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl  wrote:
> I edited the page to introduce the (super important) Solr TLP split into the 
> roadmap.
> Also added a rough timeframe and a «major theme» for each release above the 
> issue table.
> I added 8.8 and 9.1 as I think it is important to track what gets done just 
> before 9.0 and what can be deferred to after 9.0.
> 
> It has been proposed on the list to NOT rip out all deprecations in 9.0, but 
> allow users to upgrade to 9.0 with e.g. SolrCell still available, and then 
> have yet some time to change their processes to adapt to the new way of doing 
> stuff. I like that proposal. Sure, 9.0 will remove lots of deprecated code, 
> but I think it is a mistake to do all of the proposed removals at once. We 
> can spread removals out in 9.x releases, after users have had a few releases 
> with a choice between old and new and the new alternative is solid.
> 
> Thanks Gus for taking ownership and suggesting a process! Feel free to rework 
> what I edited into a structure you see more fit.
> 
> Jan
> 
>> 11. aug. 2020 kl. 18:51 skrev Gus Heck :
>> 
>> I was thinking that level of detail is in the Jira... I don't see any reason 
>> for things to disappear (in fact rejected should go in a rejected list for 
>> future reference.)
>> 
>> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:
>> Maybe also add “in progress”? So items do not disappear suddenly from the 
>> page when work really starts on them?
>> 
>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>> Cool, since I brought it up, I can volunteer to help manage the page. We 
>> should get jira issue links in there wherever possible. Do we want to build 
>> an initial list and have some sort of Proposed/Planned workflow so readers 
>> can have confidence (or appropriate lack of confidence) in what they see 
>> there? voting on things seems like too much but maybe folks who care watch 
>> the page, and if something is on there for a week without objection it can 
>> be called accepted? If a discussion starts here it can be marked 
>> "Considering" so... something like this:
>> 
>> 4 states: Proposed, Considering, Planned, Rejected
>> 
>> Workflow like this:
>> Proposed ---(no objection 1 wk) --> Planned 
>> Proposed ---(discussion)--> Considering
>> Considering (agreement) --> Planned
>> Considering (deferred) ---> Proposed (later release)
>> Considering (unsuitable) -> Rejected
>> Considering (promoted) ---> Proposed (earlier release)
>> Planned (difficulty found) ---> Considering
>> 
>> Anything in "Considering" should have an active dev list thread, and if it 
>> didn't happen on the list it didn't happen :). Any of that (or differences 
>> of opinion during Considering) can be overridden by a formal vote of course
>> 
>> -Gus
>> 
>> 
>> 
>> 
>> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya 
>>  wrote:
>> I've created a placeholder document here: 
>> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
>> Let us put in all our items there.

Re: RoadMap?

2020-08-27 Thread David Smiley
>
> It has been proposed on the list to NOT rip out all deprecations in 9.0,
> but allow users to upgrade to 9.0 with e.g. SolrCell still available, and
> then have yet some time to change their processes to adapt to the new way
> of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> deprecated code, but I think it is a mistake to do all of the proposed
> removals at once. We can spread removals out in 9.x releases, after users
> have had a few releases with a choice between old and new and the new
> alternative is solid.
>

I find it highly depressing that we can't, *in a major release*, manage to
get rid of our deprecations -- particularly for code that has a new home
and is packaged in a form that is trivial to install (thanks to our new
awesome package manager).  I'm sympathetic to waiting to delete until
*after* there is an actual package ready at that time (rather than just the
promise of one).

Also, users are generally cautious about performing a major version upgrade.
There's time.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Aug 12, 2020 at 4:06 AM Jan Høydahl  wrote:

> I edited the page to introduce the (super important) Solr TLP split into
> the roadmap.
> Also added a rough timeframe and a «major theme» for each release above
> the issue table.
> I added 8.8 and 9.1 as I think it is important to track what gets done
> just before 9.0 and what can be deferred to after 9.0.
>
> It has been proposed on the list to NOT rip out all deprecations in 9.0,
> but allow users to upgrade to 9.0 with e.g. SolrCell still available, and
> then have yet some time to change their processes to adapt to the new way
> of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> deprecated code, but I think it is a mistake to do all of the proposed
> removals at once. We can spread removals out in 9.x releases, after users
> have had a few releases with a choice between old and new and the new
> alternative is solid.
>
> Thanks Gus for taking ownership and suggesting a process! Feel free to
> rework what I edited into a structure you see more fit.
>
> Jan
>
> 11. aug. 2020 kl. 18:51 skrev Gus Heck :
>
> I was thinking that level of detail is in the Jira... I don't see any
> reason for things to disappear (in fact rejected should go in a rejected
> list for future reference.)
>
> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:
>
>> Maybe also add “in progress”? So items do not disappear suddenly from the
>> page when work really starts on them?
>>
>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>>
>>> Cool, since I brought it up, I can volunteer to help manage the page. We
>>> should get jira issue links in there wherever possible. Do we want to build
>>> an initial list and have some sort of Proposed/Planned workflow so readers
>>> can have confidence (or appropriate lack of confidence) in what they see
>>> there? voting on things seems like too much but maybe folks who care watch
>>> the page, and if something is on there for a week without objection it can
>>> be called accepted? If a discussion starts here it can be marked
>>> "Considering" so... something like this:
>>>
>>> 4 states: Proposed, Considering, Planned, Rejected
>>>
>>> Workflow like this:
>>> Proposed ---(no objection 1 wk) --> Planned
>>> Proposed ---(discussion)--> Considering
>>> Considering (agreement) --> Planned
>>> Considering (deferred) ---> Proposed (later release)
>>> Considering (unsuitable) -> Rejected
>>> Considering (promoted) ---> Proposed (earlier release)
>>> Planned (difficulty found) ---> Considering
>>>
>>> Anything in "Considering" should have an active dev list thread, and if
>>> it didn't happen on the list it didn't happen :). Any of that (or
>>> differences of opinion during Considering) can be overridden by a formal
>>> vote of course
>>>
>>> -Gus
>>>
>>>
>>>
>>>
>>> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya <
>>> ichattopadhy...@gmail.com> wrote:
>>>
>>>> I've created a placeholder document here:
>>>> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
>>>> Let us put in all our items there.
>>>>
>>>> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl 
>>>> wrote:
>>>>
>>>>> Let’s revive this email thread about Roadmap.
>>>>>
>>>>>
>>>
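
The Proposed/Considering/Planned/Rejected workflow quoted above can be read
as a small state machine. What follows is only a minimal sketch in Python,
under stated assumptions: the names State, TRANSITIONS and advance are
hypothetical and not part of any Solr or Confluence tooling, the event
strings are just the conditions written in parentheses in the thread, and
the "later release" / "earlier release" moves are modelled simply as a
return to Proposed.

```python
from enum import Enum


class State(Enum):
    PROPOSED = "Proposed"
    CONSIDERING = "Considering"
    PLANNED = "Planned"
    REJECTED = "Rejected"


# (current state, event) -> next state, transcribed from the quoted workflow.
# Event names are hypothetical labels taken from the parenthesised conditions.
TRANSITIONS = {
    (State.PROPOSED, "no objection 1 wk"): State.PLANNED,
    (State.PROPOSED, "discussion"): State.CONSIDERING,
    (State.CONSIDERING, "agreement"): State.PLANNED,
    (State.CONSIDERING, "deferred"): State.PROPOSED,   # re-proposed for a later release
    (State.CONSIDERING, "unsuitable"): State.REJECTED,
    (State.CONSIDERING, "promoted"): State.PROPOSED,   # re-proposed for an earlier release
    (State.PLANNED, "difficulty found"): State.CONSIDERING,
}


def advance(state: State, event: str) -> State:
    """Return the next state, or raise ValueError if the workflow has no such move."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event!r}") from None


if __name__ == "__main__":
    # Walk one roadmap item through a plausible sequence of events.
    state = State.PROPOSED
    for event in ("discussion", "agreement", "difficulty found", "unsuitable"):
        state = advance(state, event)
        print(event, "->", state.value)
```

Rejected has no outgoing transitions in this sketch, which fits the
suggestion that rejected items stay on a list for future reference. The
formal-vote override mentioned in the thread is not modelled.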

Re: RoadMap?

2020-08-12 Thread Jason Gerlowski
I agree with the approach Jan voiced - that at least some of these
features should appear in 9.0 as deprecated to give users more time.

That said, maybe discussing what to do around these features should be
its own thread or should be taken back to some of the specific jiras
where there's already been some discussion (e.g. SOLR-14616).  It just
seems likely to hijack this thread away from other discussions (how to
manage/handle our new Roadmap page).

On Wed, Aug 12, 2020 at 9:35 AM Ishan Chattopadhyaya
 wrote:
>
> > It has been proposed on the list to NOT rip out all deprecations in 9.0, 
> > but allow users to
> > upgrade to 9.0 with e.g. SolrCell still available, and then have yet some 
> > time to change their
> > processes to adapt to the new way of doing stuff. I like that proposal. 
> > Sure, 9.0 will remove lots
> > of deprecated code, but I think it is a mistake to do all of the proposed 
> > removals at once. We
> > can spread removals out in 9.x releases, after users have had a few 
> > releases with a choice between
> > old and new and the new alternative is solid.
>
> I support the DIH, autoscaling and CDCR going away in 9.0, rest of the things 
> can just move into first party packages and continue to be part of the 
> distribution. Does that make sense, Jan?
>
> On Wed, Aug 12, 2020 at 1:36 PM Jan Høydahl  wrote:
>>
>> I edited the page to introduce the (super important) Solr TLP split into the 
>> roadmap.
>> Also added a rough timeframe and a «major theme» for each release above the 
>> issue table.
>> I added 8.8 and 9.1 as I think it is important to track what gets done just 
>> before 9.0 and what can be deferred to after 9.0.
>>
>> It has been proposed on the list to NOT rip out all deprecations in 9.0, but 
>> allow users to upgrade to 9.0 with e.g. SolrCell still available, and then 
>> have yet some time to change their processes to adapt to the new way of 
>> doing stuff. I like that proposal. Sure, 9.0 will remove lots of deprecated 
>> code, but I think it is a mistake to do all of the proposed removals at 
>> once. We can spread removals out in 9.x releases, after users have had a few 
>> releases with a choice between old and new and the new alternative is solid.
>>
>> Thanks Gus for taking ownership and suggesting a process! Feel free to 
>> rework what I edited into a structure you see more fit.
>>
>> Jan
>>
>> 11. aug. 2020 kl. 18:51 skrev Gus Heck :
>>
>> I was thinking that level of detail is in the Jira... I don't see any reason 
>> for things to disappear (in fact rejected should go in a rejected list for 
>> future reference.)
>>
>> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:
>>>
>>> Maybe also add “in progress”? So items do not disappear suddenly from the 
>>> page when work really starts on them?
>>>
>>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>>>>
>>>> Cool, since I brought it up, I can volunteer to help manage the page. We 
>>>> should get jira issue links in there wherever possible. Do we want to 
>>>> build an initial list and have some sort of Proposed/Planned workflow so 
>>>> readers can have confidence (or appropriate lack of confidence) in what 
>>>> they see there? voting on things seems like too much but maybe folks who 
>>>> care watch the page, and if something is on there for a week without 
>>>> objection it can be called accepted? If a discussion starts here it can be 
>>>> marked "Considering" so... something like this:
>>>>
>>>> 4 states: Proposed, Considering, Planned, Rejected
>>>>
>>>> Workflow like this:
>>>> Proposed ---(no objection 1 wk) --> Planned
>>>> Proposed ---(discussion)--> Considering
>>>> Considering (agreement) --> Planned
>>>> Considering (deferred) ---> Proposed (later release)
>>>> Considering (unsuitable) -> Rejected
>>>> Considering (promoted) ---> Proposed (earlier release)
>>>> Planned (difficulty found) ---> Considering
>>>>
>>>> Anything in "Considering" should have an active dev list thread, and if it 
>>>> didn't happen on the list it didn't happen :). Any of that (or differences 
>>>> of opinion during Considering) can be overridden by a formal vote of course
>>>>
>>>> -Gus
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 11, 2020 a

Re: RoadMap?

2020-08-12 Thread Ishan Chattopadhyaya
> It has been proposed on the list to NOT rip out all deprecations in 9.0,
> but allow users to upgrade to 9.0 with e.g. SolrCell still available, and
> then have yet some time to change their processes to adapt to the new way
> of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> deprecated code, but I think it is a mistake to do all of the proposed
> removals at once. We can spread removals out in 9.x releases, after users
> have had a few releases with a choice between old and new and the new
> alternative is solid.

I support DIH, autoscaling and CDCR going away in 9.0; the rest of the
things can just move into first-party packages and continue to be part of
the distribution. Does that make sense, Jan?

On Wed, Aug 12, 2020 at 1:36 PM Jan Høydahl  wrote:

> I edited the page to introduce the (super important) Solr TLP split into
> the roadmap.
> Also added a rough timeframe and a «major theme» for each release above
> the issue table.
> I added 8.8 and 9.1 as I think it is important to track what gets done
> just before 9.0 and what can be deferred to after 9.0.
>
> It has been proposed on the list to NOT rip out all deprecations in 9.0,
> but allow users to upgrade to 9.0 with e.g. SolrCell still available, and
> then have yet some time to change their processes to adapt to the new way
> of doing stuff. I like that proposal. Sure, 9.0 will remove lots of
> deprecated code, but I think it is a mistake to do all of the proposed
> removals at once. We can spread removals out in 9.x releases, after users
> have had a few releases with a choice between old and new and the new
> alternative is solid.
>
> Thanks Gus for taking ownership and suggesting a process! Feel free to
> rework what I edited into a structure you see more fit.
>
> Jan
>
> 11. aug. 2020 kl. 18:51 skrev Gus Heck :
>
> I was thinking that level of detail is in the Jira... I don't see any
> reason for things to disappear (in fact rejected should go in a rejected
> list for future reference.)
>
> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:
>
>> Maybe also add “in progress”? So items do not disappear suddenly from the
>> page when work really starts on them?
>>
>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>>
>>> Cool, since I brought it up, I can volunteer to help manage the page. We
>>> should get jira issue links in there wherever possible. Do we want to build
>>> an initial list and have some sort of Proposed/Planned workflow so readers
>>> can have confidence (or appropriate lack of confidence) in what they see
>>> there? voting on things seems like too much but maybe folks who care watch
>>> the page, and if something is on there for a week without objection it can
>>> be called accepted? If a discussion starts here it can be marked
>>> "Considering" so... something like this:
>>>
>>> 4 states: Proposed, Considering, Planned, Rejected
>>>
>>> Workflow like this:
>>> Proposed ---(no objection 1 wk) --> Planned
>>> Proposed ---(discussion)--> Considering
>>> Considering (agreement) --> Planned
>>> Considering (deferred) ---> Proposed (later release)
>>> Considering (unsuitable) -> Rejected
>>> Considering (promoted) ---> Proposed (earlier release)
>>> Planned (difficulty found) ---> Considering
>>>
>>> Anything in "Considering" should have an active dev list thread, and if
>>> it didn't happen on the list it didn't happen :). Any of that (or
>>> differences of opinion during Considering) can be overridden by a formal
>>> vote of course
>>>
>>> -Gus
>>>
>>>
>>>
>>>
>>> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya <
>>> ichattopadhy...@gmail.com> wrote:
>>>
>>>> I've created a placeholder document here:
>>>> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
>>>> Let us put in all our items there.
>>>>
>>>> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl 
>>>> wrote:
>>>>
>>>>> Let’s revive this email thread about Roadmap.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> With so many large initiatives going on, and the TLP split also, I
>>>>> think it makes perfect sense with a Roadmap.
>>>>>
>>>>>
>>>>> I know we’re not used to that kind of thing - we tend to just let
>>>>> things play out as it happens to land in 

Re: RoadMap?

2020-08-12 Thread Jan Høydahl
I edited the page to introduce the (super important) Solr TLP split into the 
roadmap.
Also added a rough timeframe and a «major theme» for each release above the 
issue table.
I added 8.8 and 9.1 as I think it is important to track what gets done just 
before 9.0 and what can be deferred to after 9.0.

It has been proposed on the list to NOT rip out all deprecations in 9.0, but 
allow users to upgrade to 9.0 with e.g. SolrCell still available, and then have 
yet some time to change their processes to adapt to the new way of doing stuff. 
I like that proposal. Sure, 9.0 will remove lots of deprecated code, but I 
think it is a mistake to do all of the proposed removals at once. We can spread 
removals out in 9.x releases, after users have had a few releases with a choice 
between old and new and the new alternative is solid.

Thanks Gus for taking ownership and suggesting a process! Feel free to rework 
what I edited into a structure you see more fit.

Jan

> On 11 Aug 2020, at 18:51, Gus Heck wrote:
> 
> I was thinking that level of detail is in the Jira... I don't see any reason 
> for things to disappear (in fact rejected should go in a rejected list for 
> future reference.)
> 
> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg wrote:
> Maybe also add “in progress”? So items do not disappear suddenly from the 
> page when work really starts on them?
> 
> On Tue 11 Aug 2020 at 17:15, Gus Heck wrote:
> Cool, since I brought it up, I can volunteer to help manage the page. We 
> should get jira issue links in there wherever possible. Do we want to build 
> an initial list and have some sort of Proposed/Planned workflow so readers 
> can have confidence (or appropriate lack of confidence) in what they see 
> there? voting on things seems like too much but maybe folks who care watch 
> the page, and if something is on there for a week without objection it can be 
> called accepted? If a discussion starts here it can be marked "Considering" 
> so... something like this:
> 
> 4 states: Proposed, Considering, Planned, Rejected
> 
> Workflow like this:
> Proposed ---(no objection 1 wk) --> Planned 
> Proposed ---(discussion)--> Considering
> Considering (agreement) --> Planned
> Considering (deferred) ---> Proposed (later release)
> Considering (unsuitable) -> Rejected
> Considering (promoted) ---> Proposed (earlier release)
> Planned (difficulty found) ---> Considering
> 
> Anything in "Considering" should have an active dev list thread, and if it 
> didn't happen on the list it didn't happen :). Any of that (or differences of 
> opinion during Considering) can be overridden by a formal vote of course
> 
> -Gus
> 
> 
> 
> 
> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya wrote:
> I've created a placeholder document here:
> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
> Let us put in all our items there.
> 
> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl wrote:
> Let’s revive this email thread about Roadmap.
>
> With so many large initiatives going on, and the TLP split also, I think it
> makes perfect sense with a Roadmap.
> I know we’re not used to that kind of thing - we tend to just let things play
> out as it happens to land in various releases, but this time is special, and
> I think we’d benefit from more coordination. I don’t know how to enforce such
> coordination though, other than appealing to all committers to endorse the
> roadmap and respect it when they merge things. We may not be able to set a
> release date for 9.0 right now, but we may be able to define preconditions
> and scope certain features to 9.0 or 9.1 rather than 8.7 or 8.8 - that kind
> of coarse-grained decisions. We also may need a person that «owns» the
> Roadmap confluence page and actively promotes it, tries to keep it up to date
> and reminds the rest of us about its existence. A roadmap must NOT be a brake
> slowing us down, but a tool helping us avoid silly mistakes.
>
> Jan
>
> > On 5 Jul 2020, at 02:39, Noble Paul wrote:
> >
> > I think the logical thing to do today is completely rip out all
> > autoscaling code as it exists today.
> > Let's deprecate that in 8.7 and build something for "assign-strategy".
> > Autoscaling, if

Re: RoadMap?

2020-08-11 Thread Marcus Eagan
+1 (non-binding)

On Tue, Aug 11, 2020 at 09:52 Gus Heck  wrote:

> I was thinking that level of detail is in the Jira... I don't see any
> reason for things to disappear (in fact rejected should go in a rejected
> list for future reference.)
>
> On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:
>
>> Maybe also add “in progress”? So items do not disappear suddenly from the
>> page when work really starts on them?
>>
>> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>>
>>> Cool, since I brought it up, I can volunteer to help manage the page. We
>>> should get jira issue links in there wherever possible. Do we want to build
>>> an initial list and have some sort of Proposed/Planned workflow so readers
>>> can have confidence (or appropriate lack of confidence) in what they see
>>> there? voting on things seems like too much but maybe folks who care watch
>>> the page, and if something is on there for a week without objection it can
>>> be called accepted? If a discussion starts here it can be marked
>>> "Considering" so... something like this:
>>>
>>> 4 states: Proposed, Considering, Planned, Rejected
>>>
>>> Workflow like this:
>>> Proposed ---(no objection 1 wk) --> Planned
>>> Proposed ---(discussion)--> Considering
>>> Considering (agreement) --> Planned
>>> Considering (deferred) ---> Proposed (later release)
>>> Considering (unsuitable) -> Rejected
>>> Considering (promoted) ---> Proposed (earlier release)
>>> Planned (difficulty found) ---> Considering
>>>
>>> Anything in "Considering" should have an active dev list thread, and if
>>> it didn't happen on the list it didn't happen :). Any of that (or
>>> differences of opinion during Considering) can be overridden by a formal
>>> vote of course
>>>
>>> -Gus
>>>
>>>
>>>
>>>
>>> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya <
>>> ichattopadhy...@gmail.com> wrote:
>>>
>>>> I've created a placeholder document here:
>>>> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
>>>> Let us put in all our items there.
>>>>
>>>> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl 
>>>> wrote:
>>>>
>>>>> Let’s revive this email thread about Roadmap.
>>>>>
>>>>> With so many large initiatives going on, and the TLP split also, I
>>>>> think it makes perfect sense with a Roadmap.
>>>>> I know we’re not used to that kind of thing - we tend to just let
>>>>> things play out as it happens to land in various releases, but this time is
>>>>> special, and I think we’d benefit from more coordination. I don’t know how
>>>>> to enforce such coordination though, other than appealing to all committers
>>>>> to endorse the roadmap and respect it when they merge things. We may not be
>>>>> able to set a release date for 9.0 right now, but we may be able to define
>>>>> preconditions and scope certain features to 9.0 or 9.1 rather than 8.7 or
>>>>> 8.8 - that kind of coarse-grained decisions. We also may need a person that
>>>>> «owns» the Roadmap confluence page and actively promotes it, tries to keep
>>>>> it up to date and reminds the rest of us about its existence. A roadmap
>>>>> must NOT be a brake slowing us down, but a tool helping us avoid silly
>>>>> mistakes.
>>>>>
>>>>> Jan
>>>>>
>>>>> > On 5 Jul 2020, at 02:39, Noble Paul wrote:
>>>>> >
>>>>> > I think the logical thing to do today is completely rip out all
>>>>> > autoscaling code as it exists today.
>>>>> > Let's deprecate that in 8.7 and build something for "assign-strategy".
>>>>>
>>>>

Re: RoadMap?

2020-08-11 Thread Gus Heck
I was thinking that level of detail is in the Jira... I don't see any
reason for things to disappear (in fact rejected should go in a rejected
list for future reference.)

On Tue, Aug 11, 2020 at 12:04 PM Ilan Ginzburg  wrote:

> Maybe also add “in progress”? So items do not disappear suddenly from the
> page when work really starts on them?
>
> On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:
>
>> Cool, since I brought it up, I can volunteer to help manage the page. We
>> should get jira issue links in there wherever possible. Do we want to build
>> an initial list and have some sort of Proposed/Planned workflow so readers
>> can have confidence (or appropriate lack of confidence) in what they see
>> there? voting on things seems like too much but maybe folks who care watch
>> the page, and if something is on there for a week without objection it can
>> be called accepted? If a discussion starts here it can be marked
>> "Considering" so... something like this:
>>
>> 4 states: Proposed, Considering, Planned, Rejected
>>
>> Workflow like this:
>> Proposed ---(no objection 1 wk) --> Planned
>> Proposed ---(discussion)--> Considering
>> Considering (agreement) --> Planned
>> Considering (deferred) ---> Proposed (later release)
>> Considering (unsuitable) -> Rejected
>> Considering (promoted) ---> Proposed (earlier release)
>> Planned (difficulty found) ---> Considering
>>
>> Anything in "Considering" should have an active dev list thread, and if
>> it didn't happen on the list it didn't happen :). Any of that (or
>> differences of opinion during Considering) can be overridden by a formal
>> vote of course
>>
>> -Gus
>>
>>
>>
>>
>> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya <
>> ichattopadhy...@gmail.com> wrote:
>>
>>> I've created a placeholder document here:
>>> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
>>> Let us put in all our items there.
>>>
>>> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl 
>>> wrote:
>>>
>>>> Let’s revive this email thread about Roadmap.
>>>>
>>>> With so many large initiatives going on, and the TLP split also, I
>>>> think it makes perfect sense with a Roadmap.
>>>> I know we’re not used to that kind of thing - we tend to just let
>>>> things play out as it happens to land in various releases, but this time is
>>>> special, and I think we’d benefit from more coordination. I don’t know how
>>>> to enforce such coordination though, other than appealing to all committers
>>>> to endorse the roadmap and respect it when they merge things. We may not be
>>>> able to set a release date for 9.0 right now, but we may be able to define
>>>> preconditions and scope certain features to 9.0 or 9.1 rather than 8.7 or
>>>> 8.8 - that kind of coarse-grained decisions. We also may need a person that
>>>> «owns» the Roadmap confluence page and actively promotes it, tries to keep
>>>> it up to date and reminds the rest of us about its existence. A roadmap
>>>> must NOT be a brake slowing us down, but a tool helping us avoid silly
>>>> mistakes.
>>>>
>>>> Jan
>>>>
>>>> > On 5 Jul 2020, at 02:39, Noble Paul wrote:
>>>> >
>>>> > I think the logical thing to do today is completely rip out all
>>>> > autoscaling code as it exists today.
>>>> > Let's deprecate that in 8.7 and build something for "assign-strategy".
>>>> > Autoscaling, if required, should not be a part of Solr
>>>> >
>>>> > On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl wrote:
>>>> >>
>>>> >> +1
>>>> >>

Re: RoadMap?

2020-08-11 Thread Ilan Ginzburg
Maybe also add “in progress”? So items do not disappear suddenly from the
page when work really starts on them?

On Tue 11 Aug 2020 at 17:15, Gus Heck  wrote:

> Cool, since I brought it up, I can volunteer to help manage the page. We
> should get jira issue links in there wherever possible. Do we want to build
> an initial list and have some sort of Proposed/Planned workflow so readers
> can have confidence (or appropriate lack of confidence) in what they see
> there? voting on things seems like too much but maybe folks who care watch
> the page, and if something is on there for a week without objection it can
> be called accepted? If a discussion starts here it can be marked
> "Considering" so... something like this:
>
> 4 states: Proposed, Considering, Planned, Rejected
>
> Workflow like this:
> Proposed ---(no objection 1 wk) --> Planned
> Proposed ---(discussion)--> Considering
> Considering (agreement) --> Planned
> Considering (deferred) ---> Proposed (later release)
> Considering (unsuitable) -> Rejected
> Considering (promoted) ---> Proposed (earlier release)
> Planned (difficulty found) ---> Considering
>
> Anything in "Considering" should have an active dev list thread, and if it
> didn't happen on the list it didn't happen :). Any of that (or differences
> of opinion during Considering) can be overridden by a formal vote of course
>
> -Gus
>
>
>
>
> On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> I've created a placeholder document here:
>> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
>> Let us put in all our items there.
>>
>> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl 
>> wrote:
>>
>>> Let’s revive this email thread about Roadmap.
>>>
>>> With so many large initiatives going on, and the TLP split also, I think
>>> it makes perfect sense with a Roadmap.
>>> I know we’re not used to that kind of thing - we tend to just let things
>>> play out as it happens to land in various releases, but this time is
>>> special, and I think we’d benefit from more coordination. I don’t know how
>>> to enforce such coordination though, other than appealing to all committers
>>> to endorse the roadmap and respect it when they merge things. We may not be
>>> able to set a release date for 9.0 right now, but we may be able to define
>>> preconditions and scope certain features to 9.0 or 9.1 rather than 8.7 or
>>> 8.8 - that kind of coarse-grained decisions. We also may need a person that
>>> «owns» the Roadmap confluence page and actively promotes it, tries to keep
>>> it up to date and reminds the rest of us about its existence. A roadmap
>>> must NOT be a brake slowing us down, but a tool helping us avoid silly
>>> mistakes.
>>>
>>> Jan
>>>
>>> > On 5 Jul 2020, at 02:39, Noble Paul wrote:
>>> >
>>> > I think the logical thing to do today is completely rip out all
>>> > autoscaling code as it exists today.
>>> > Let's deprecate that in 8.7 and build something for "assign-strategy".
>>> > Autoscaling, if required, should not be a part of Solr
>>> >
>>> > On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl wrote:
>>> >>
>>> >> +1
>>> >>
>>> >> Why don’t we make a Roadmap wiki page as Cassandra suggests, and
>>> >> indicate what major things needs to happen when.
>>> >> Perhaps if we can get the Solr TLP and git-split ball rolling as a
>>> >> pre-9.0 task, then perhaps 8.8 could be the last joint release (6.6, 7.7,
>>> >> 8.8 hehe)?
>>> >> That would enable Lucene to ship 9.0 without waiting for a ton of
>>> >> alpha-quality Solr features, and Solr could have its own Roadmap wiki.
>>> >>
>>> >> Jan
>>>

Re: RoadMap?

2020-08-11 Thread Gus Heck
Cool, since I brought it up, I can volunteer to help manage the page. We
should get Jira issue links in there wherever possible. Do we want to build
an initial list and have some sort of Proposed/Planned workflow so readers
can have confidence (or appropriate lack of confidence) in what they see
there? Voting on things seems like too much, but maybe folks who care watch
the page, and if something is on there for a week without objection it can
be called accepted? If a discussion starts here it can be marked
"Considering", so... something like this:

4 states: Proposed, Considering, Planned, Rejected

Workflow like this:
Proposed ---(no objection 1 wk) --> Planned
Proposed ---(discussion)--> Considering
Considering (agreement) --> Planned
Considering (deferred) ---> Proposed (later release)
Considering (unsuitable) -> Rejected
Considering (promoted) ---> Proposed (earlier release)
Planned (difficulty found) ---> Considering

Anything in "Considering" should have an active dev list thread, and if it
didn't happen on the list it didn't happen :). Any of that (or differences
of opinion during Considering) can be overridden by a formal vote, of course.
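
For illustration only, the workflow above could be written down as a small
transition table. This is a rough Python sketch, not anything that exists in
Jira or Confluence: the event names (e.g. "no_objection_1wk") and the
advance() helper are made up for this example, and the "formal vote"
override is deliberately not modelled.

```python
from enum import Enum


class State(Enum):
    PROPOSED = "Proposed"
    CONSIDERING = "Considering"
    PLANNED = "Planned"
    REJECTED = "Rejected"


# (current state, event) -> next state.
# Event names are hypothetical labels for the arrows in the workflow list.
TRANSITIONS = {
    (State.PROPOSED, "no_objection_1wk"): State.PLANNED,
    (State.PROPOSED, "discussion"): State.CONSIDERING,
    (State.CONSIDERING, "agreement"): State.PLANNED,
    (State.CONSIDERING, "deferred"): State.PROPOSED,    # re-proposed for a later release
    (State.CONSIDERING, "unsuitable"): State.REJECTED,
    (State.CONSIDERING, "promoted"): State.PROPOSED,    # re-proposed for an earlier release
    (State.PLANNED, "difficulty_found"): State.CONSIDERING,
}


def advance(state: State, event: str) -> State:
    """Return the next state, or raise ValueError if the workflow has no such arrow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event!r}") from None
```

Rejected has no outgoing arrows in this sketch, which matches the idea of
keeping rejected items in a list for future reference rather than silently
reviving them.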

-Gus




On Tue, Aug 11, 2020 at 10:29 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> I've created a placeholder document here:
> https://cwiki.apache.org/confluence/display/SOLR/Roadmap
> Let us put in all our items there.
>
> On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl  wrote:
>
>> Let’s revive this email thread about Roadmap.
>>
>> With so many large initiatives going on, and the TLP split also, I think
>> it makes perfect sense with a Roadmap.
>> I know we’re not used to that kind of thing - we tend to just let things
>> play out as it happens to land in various releases, but this time is
>> special, and I think we’d benefit from more coordination. I don’t know how
>> to enforce such coordination though, other than appealing to all committers
>> to endorse the roadmap and respect it when they merge things. We may not be
>> able to set a release date for 9.0 right now, but we may be able to define
>> preconditions and scope certain features to 9.0 or 9.1 rather than 8.7 or
>> 8.8 - that kind of coarse-grained decisions. We also may need a person that
>> «owns» the Roadmap confluence page and actively promotes it, tries to keep
>> it up to date and reminds the rest of us about its existence. A roadmap
>> must NOT be a brake slowing us down, but a tool helping us avoid silly
>> mistakes.
>>
>> Jan
>>
>> > On 5 Jul 2020, at 02:39, Noble Paul wrote:
>> >
>> > I think the logical thing to do today is completely rip out all
>> > autoscaling code as it exists today.
>> > Let's deprecate that in 8.7 and build something for "assign-strategy".
>> > Autoscaling, if required, should not be a part of Solr
>> >
>> > On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl wrote:
>> >>
>> >> +1
>> >>
>> >> Why don’t we make a Roadmap wiki page as Cassandra suggests, and
>> >> indicate what major things needs to happen when.
>> >> Perhaps if we can get the Solr TLP and git-split ball rolling as a
>> >> pre-9.0 task, then perhaps 8.8 could be the last joint release (6.6, 7.7,
>> >> 8.8 hehe)?
>> >> That would enable Lucene to ship 9.0 without waiting for a ton of
>> >> alpha-quality Solr features, and Solr could have its own Roadmap wiki.
>> >>
>> >> Jan
>> >>
>> >> On 3 Jul 2020, at 09:19, Dawid Weiss wrote:
>> >>
>> >>> I totally expect some things to bubble up when we try to release with
>> >>> Gradle, the tarball being one. I don’t think that’s a very big issue, but
>> >>> if you have lots of “not very big” issues they do add up.
>> >>
>> >> Adding a tarball is literally 3-5 lines of code (you add a task that
>> >> builds a tarball or a zip file from the outputs of solr/packaging toDir
>> >> task)... The bigger issue with gradle is that somebody has to step up and
>> >> try to identify any other issues and/or missing bits when trying to do a
>> >> full release cycle.
>> >>
>> >> D.
>> >>
>> >>
>> >
>> >
>> > --
>> > -
>> > Noble Paul
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)


Re: RoadMap?

2020-08-11 Thread Ishan Chattopadhyaya
I've created a placeholder document here:
https://cwiki.apache.org/confluence/display/SOLR/Roadmap
Let us put in all our items there.

On Tue, Aug 11, 2020 at 4:45 PM Jan Høydahl  wrote:

> Let’s revive this email thread about Roadmap.
>
> With so many large initiatives going on, and the TLP split also, I think
> it makes perfect sense with a Roadmap.
> I know we’re not used to that kind of thing - we tend to just let things
> play out as it happens to land in various releases, but this time is
> special, and I think we’d benefit from more coordination. I don’t know how
> to enforce such coordination though, other than appealing to all committers
> to endorse the roadmap and respect it when they merge things. We may not be
> able to set a release date for 9.0 right now, but we may be able to define
> preconditions and scope certain features to 9.0 or 9.1 rather than 8.7 or
> 8.8 - that kind of coarse-grained decisions. We also may need a person that
> «owns» the Roadmap confluence page and actively promotes it, tries to keep
> it up to date and reminds the rest of us about its existence. A roadmap
> must NOT be a brake slowing us down, but a tool helping us avoid silly
> mistakes.
>
> Jan
>
> > On 5 Jul 2020, at 02:39, Noble Paul wrote:
> >
> > I think the logical thing to do today is completely rip out all
> > autoscaling code as it exists today.
> > Let's deprecate that in 8.7 and build something for "assign-strategy".
> > Autoscaling, if required, should not be a part of Solr
> >
> > On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl wrote:
> >>
> >> +1
> >>
> >> Why don’t we make a Roadmap wiki page as Cassandra suggests, and
> >> indicate what major things needs to happen when.
> >> Perhaps if we can get the Solr TLP and git-split ball rolling as a
> >> pre-9.0 task, then perhaps 8.8 could be the last joint release (6.6, 7.7,
> >> 8.8 hehe)?
> >> That would enable Lucene to ship 9.0 without waiting for a ton of
> >> alpha-quality Solr features, and Solr could have its own Roadmap wiki.
> >>
> >> Jan
> >>
> >> On 3 Jul 2020, at 09:19, Dawid Weiss wrote:
> >>
> >>> I totally expect some things to bubble up when we try to release with
> >>> Gradle, the tarball being one. I don’t think that’s a very big issue, but
> >>> if you have lots of “not very big” issues they do add up.
> >>
> >> Adding a tarball is literally 3-5 lines of code (you add a task that
> >> builds a tarball or a zip file from the outputs of solr/packaging toDir
> >> task)... The bigger issue with gradle is that somebody has to step up and
> >> try to identify any other issues and/or missing bits when trying to do a
> >> full release cycle.
> >>
> >> D.
> >>
> >>
> >
> >
> > --
> > -
> > Noble Paul
> >
>
>


Re: RoadMap?

2020-08-11 Thread Jan Høydahl
Let’s revive this email thread about Roadmap.

With so many large initiatives going on, and the TLP split as well, I think a
Roadmap makes perfect sense.
I know we’re not used to that kind of thing - we tend to just let things play
out as they happen to land in various releases, but this time is special, and I
think we’d benefit from more coordination. I don’t know how to enforce such
coordination though, other than appealing to all committers to endorse the
roadmap and respect it when they merge things. We may not be able to set a
release date for 9.0 right now, but we may be able to define preconditions and
scope certain features to 9.0 or 9.1 rather than 8.7 or 8.8 - those kinds of
coarse-grained decisions. We may also need a person who «owns» the Roadmap
Confluence page and actively promotes it, tries to keep it up to date and
reminds the rest of us about its existence. A roadmap must NOT be a brake
slowing us down, but a tool helping us avoid silly mistakes.

Jan

> On 5 Jul 2020, at 02:39, Noble Paul wrote:
> 
> I think the logical thing to do today is completely rip out all
> autoscaling code as it exists today.
> Let's deprecate that in 8.7 and build something for "assign-strategy".
> Autoscaling, if required, should not be a part of Solr
> 
> 
> 
> On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl  wrote:
>> 
>> +1
>> 
>> Why don’t we make a Roadmap wiki page as Cassandra suggests, and indicate 
>> what major things needs to happen when.
>> Perhaps if we can get the Solr TLP and git-split ball rolling as a pre-9.0 
>> task, then perhaps 8.8 could be the last joint release (6.6, 7.7, 8.8 hehe)?
>> That would enable Lucene to ship 9.0 without waiting for a ton of 
>> alpha-quality Solr features, and Solr could have its own Roadmap wiki.
>> 
>> Jan
>> 
>> On 3 Jul 2020, at 09:19, Dawid Weiss wrote:
>> 
>> 
>>> I totally expect some things to bubble up when we try to release with 
>>> Gradle, the tarball being one. I don’t think that’s a very big issue, but 
>>> if you have lots of “not very big” issues they do add up.
>> 
>> 
>> Adding a tarball is literally 3-5 lines of code (you add a task that builds 
>> a tarball or a zip file from the outputs of solr/packaging toDir task)... 
>> The bigger issue with gradle is that somebody has to step up and try to 
>> identify any other issues and/or missing bits when trying to do a full 
>> release cycle.
>> 
>> D.
>> 
>> 
> 
> 
> -- 
> -
> Noble Paul
> 





Re: RoadMap?

2020-07-04 Thread Noble Paul
I think the logical thing to do today is completely rip out all
autoscaling code as it exists today.
Let's deprecate that in 8.7 and build something for "assign-strategy".
Autoscaling, if required, should not be a part of Solr.



On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl  wrote:
>
> +1
>
> Why don’t we make a Roadmap wiki page as Cassandra suggests, and indicate 
> what major things needs to happen when.
> Perhaps if we can get the Solr TLP and git-split ball rolling as a pre-9.0 
> task, then perhaps 8.8 could be the last joint release (6.6, 7.7, 8.8 hehe)?
> That would enable Lucene to ship 9.0 without waiting for a ton of 
> alpha-quality Solr features, and Solr could have its own Roadmap wiki.
>
> Jan
>
> On 3 Jul 2020, at 09:19, Dawid Weiss wrote:
>
>
>> I totally expect some things to bubble up when we try to release with 
>> Gradle, the tarball being one. I don’t think that’s a very big issue, but if 
>> you have lots of “not very big” issues they do add up.
>
>
> Adding a tarball is literally 3-5 lines of code (you add a task that builds a 
> tarball or a zip file from the outputs of solr/packaging toDir task)... The 
> bigger issue with gradle is that somebody has to step up and try to identify 
> any other issues and/or missing bits when trying to do a full release cycle.
>
> D.
>
>


-- 
-
Noble Paul




Re: RoadMap?

2020-07-03 Thread Jan Høydahl
+1

Why don’t we make a Roadmap wiki page as Cassandra suggests, and indicate which
major things need to happen when?
If we can get the Solr TLP and git-split ball rolling as a pre-9.0 task, then
perhaps 8.8 could be the last joint release (6.6, 7.7, 8.8 hehe)?
That would enable Lucene to ship 9.0 without waiting for a ton of alpha-quality
Solr features, and Solr could have its own Roadmap wiki.

Jan

> On 3 Jul 2020, at 09:19, Dawid Weiss wrote:
> 
> 
> I totally expect some things to bubble up when we try to release with Gradle, 
> the tarball being one. I don’t think that’s a very big issue, but if you have 
> lots of “not very big” issues they do add up.
> 
> Adding a tarball is literally 3-5 lines of code (you add a task that builds a 
> tarball or a zip file from the outputs of solr/packaging toDir task)... The 
> bigger issue with gradle is that somebody has to step up and try to identify 
> any other issues and/or missing bits when trying to do a full release cycle. 
> 
> D. 



Re: RoadMap?

2020-07-03 Thread Dawid Weiss
> I totally expect some things to bubble up when we try to release with
> Gradle, the tarball being one. I don’t think that’s a very big issue, but
> if you have lots of “not very big” issues they do add up.
>

Adding a tarball is literally 3-5 lines of code (you add a task that builds
a tarball or a zip file from the outputs of solr/packaging toDir task)...
The bigger issue with gradle is that somebody has to step up and try to
identify any other issues and/or missing bits when trying to do a full
release cycle.

D.


Re: RoadMap?

2020-07-02 Thread Ishan Chattopadhyaya
> Autoscaling is another big item, but I think we have to put it into
> 9x, it’s a critical (and critically broken) functionality

I think autoscaling has many aspects to it. The only piece that I find
critical to Solr (SolrCloud / core) is reasonable (not super smart) replica
placement. The rest of the autoscaling functionality (and replica placement as
well) should now be built as pluggable components, preferably as packages.

On Fri, Jul 3, 2020 at 5:43 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> > With 9x having java 11 and gradle migrations on the dev side, and about to have
> > a significant round of deprecations/removals and migrations to plugin for things
> > such as CDCR, DIH etc (see https://issues.apache.org/jira/browse/SOLR-13442
> > and https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?)
> > need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e. streaming)
> > before 9x is able to be released. Plus there's been talk of a revamped UI...
>
> I think we should avoid large code drop-ins at all costs. No new CDCR or
> UI code should be part of Solr codebase. Any big, new feature should be
> built as packages.
>
> On Fri, Jul 3, 2020 at 1:49 AM David Smiley  wrote:
>
>> No more JIRA fields please; "Fix Version" is adequate.  You can edit an
>> issue after creating it to set the "Fix version" to "master (9.0)"; the
>> issue doesn't have to be resolved yet.  I recently had ASF change JIRA so
>> that this field is not editable on creation of the issue because our
>> contributors don't know any better than to put something inappropriate
>> there.  But the edit screen works.
>>
>> I think Lucene might release v9 without Solr if Solr is going to drag its
>> feet for too long.  Wearing my Lucene hat (er... well shirt), I support
>> that within reason.
>>
>> Agreed on Confluence.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Thu, Jul 2, 2020 at 2:30 PM Gus Heck  wrote:
>>
>>> Looking at
>>> https://issues.apache.org/jira/projects/SOLR?selectedItem=com.atlassian.jira.jira-projects-plugin%3Arelease-page=unreleased
>>>  maybe
>>> this is as simple as having one more field on our issues... Currently fix
>>> version denotes when something got fixed, perhaps a "target version" field
>>> could indicate when we want to fix it by. Then we just need a tag in jira
>>> and perhaps a branch.
>>>
>>> Alternately (or maybe additionally) we could make a "board" if that's
>>> easier to monitor.
>>>
>>> On Thu, Jul 2, 2020 at 2:02 PM Gus Heck  wrote:
>>>
>>>> Jira typically has features for designating what's in a release
>>>>
>>>> On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson
>>>> wrote:
>>>>
>>>>> I totally expect some things to bubble up when we try to release with
>>>>> Gradle, the tarball being one. I don’t think that’s a very big issue, but
>>>>> if you have lots of “not very big” issues they do add up.
>>>>>
>>>>> That said, yeah, I do think it’s time to start getting a handle on
>>>>> 9.0. Pulling Ant out of the build system is another possibility.
>>>>>
>>>>> Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a
>>>>> major release?
>>>>>
>>>>> Sound like a Wiki page or some such to me…
>>>>>
>>>>> Erick
>>>>>
>>>>> > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
>>>>> >
>>>>> > Autoscaling is another big item, but I think we have to put it into
>>>>> 9x, it’s a critical (and critically broken) functionality. We’re making
>>>>> some progress with Ilan and Noble so I’m cautiously optimistic.
>>>>> >
>>>>> >> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
>>>>> >>
>>>>> >> Should we have one?
>>>>> >>
>>>>> >> With 9x having java 11 and gradle migrations on the dev side, and
>>>>> about to have a significant round of deprecations/removals and migrations
>>>>> to plugin for things such as CDCR, DIH etc (see
>>>>> https://issues.apache.org/jira/browse/SOLR-13442 and
>>>>> https://issues.apache.org/jira/browse/SOLR-14022) some of which
>>>>> may(?) need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
>>>>> streaming) before 9x is able to be released. Plus there's been talk of a
>>>>> revamped UI...
>>>>> >>
>>>>> >> I'm worried that there is a danger that 9x will continue to diverge
>>>>> and pick up major changes, but always have something big in progress and
>>>>> never be able to release.
>>>>> >>
>>>>> >> Perhaps we should attempt to put a box around the things that need
>>>>> to happen for 9x, and begin targeting any larger projects that come up at
>>>>> 10x? Among other things the gradle work probably can't be complete until
>>>>> someone has gone through a release using it. (I don't think we build the
>>>>> tarballs in gradle yet for example, unless that got added recently)
>>>>> >>
>>>>> >> -Gus
>>>>> >>
>>>>> >> --
>>>>> >> http://www.needhamsoftware.com (work)
>>>>> >> http://www.the111shift.com (play)
>>>>> >
>>>>>
>>>>>
>>>>>

Re: RoadMap?

2020-07-02 Thread Ishan Chattopadhyaya
> With 9x having java 11 and gradle migrations on the dev side, and about to have
> a significant round of deprecations/removals and migrations to plugin for things
> such as CDCR, DIH etc (see https://issues.apache.org/jira/browse/SOLR-13442
> and https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?)
> need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e. streaming)
> before 9x is able to be released. Plus there's been talk of a revamped UI...

I think we should avoid large code drop-ins at all costs. No new CDCR or UI
code should be part of Solr codebase. Any big, new feature should be built
as packages.

On Fri, Jul 3, 2020 at 1:49 AM David Smiley  wrote:

> No more JIRA fields please; "Fix Version" is adequate.  You can edit an
> issue after creating it to set the "Fix version" to "master (9.0)"; the
> issue doesn't have to be resolved yet.  I recently had ASF change JIRA so
> that this field is not editable on creation of the issue because our
> contributors don't know any better than to put something inappropriate
> there.  But the edit screen works.
>
> I think Lucene might release v9 without Solr if Solr is going to drag its
> feet for too long.  Wearing my Lucene hat (er... well shirt), I support
> that within reason.
>
> Agreed on Confluence.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Jul 2, 2020 at 2:30 PM Gus Heck  wrote:
>
>> Looking at
>> https://issues.apache.org/jira/projects/SOLR?selectedItem=com.atlassian.jira.jira-projects-plugin%3Arelease-page=unreleased
>>  maybe
>> this is as simple as having one more field on our issues... Currently fix
>> version denotes when something got fixed, perhaps a "target version" field
>> could indicate when we want to fix it by. Then we just need a tag in jira
>> and perhaps a branch.
>>
>> Alternately (or maybe additionally) we could make a "board" if that's
>> easier to monitor.
>>
>> On Thu, Jul 2, 2020 at 2:02 PM Gus Heck  wrote:
>>
>>> Jira typically has features for designating what's in a release
>>>
>>> On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson 
>>> wrote:
>>>
>>>> I totally expect some things to bubble up when we try to release with
>>>> Gradle, the tarball being one. I don’t think that’s a very big issue, but
>>>> if you have lots of “not very big” issues they do add up.
>>>>
>>>> That said, yeah, I do think it’s time to start getting a handle on 9.0.
>>>> Pulling Ant out of the build system is another possibility.
>>>>
>>>> Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a
>>>> major release?
>>>>
>>>> Sound like a Wiki page or some such to me…
>>>>
>>>> Erick
>>>>
>>>> > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
>>>> >
>>>> > Autoscaling is another big item, but I think we have to put it into
>>>> 9x, it’s a critical (and critically broken) functionality. We’re making
>>>> some progress with Ilan and Noble so I’m cautiously optimistic.
>>>> >
>>>> >> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
>>>> >>
>>>> >> Should we have one?
>>>> >>
>>>> >> With 9x having java 11 and gradle migrations on the dev side, and
>>>> about to have a significant round of deprecations/removals and migrations
>>>> to plugin for things such as CDCR, DIH etc (see
>>>> https://issues.apache.org/jira/browse/SOLR-13442 and
>>>> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?)
>>>> need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
>>>> streaming) before 9x is able to be released. Plus there's been talk of a
>>>> revamped UI...
>>>> >>
>>>> >> I'm worried that there is a danger that 9x will continue to diverge
>>>> and pick up major changes, but always have something big in progress and
>>>> never be able to release.
>>>> >>
>>>> >> Perhaps we should attempt to put a box around the things that need
>>>> to happen for 9x, and begin targeting any larger projects that come up at
>>>> 10x? Among other things the gradle work probably can't be complete until
>>>> someone has gone through a release using it. (I don't think we build the
>>>> tarballs in gradle yet for example, unless that got added recently)
>>>> >>
>>>> >> -Gus
>>>> >>
>>>> >> --
>>>> >> http://www.needhamsoftware.com (work)
>>>> >> http://www.the111shift.com (play)
>>>> >
>>>>
>>>>
>>>> -
>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>
>>>>
>>>
>>> --
>>> http://www.needhamsoftware.com (work)
>>> http://www.the111shift.com (play)
>>>
>>
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>


Re: RoadMap?

2020-07-02 Thread David Smiley
No more JIRA fields please; "Fix Version" is adequate.  You can edit an
issue after creating it to set the "Fix version" to "master (9.0)"; the
issue doesn't have to be resolved yet.  I recently had ASF change JIRA so
that this field is not editable on creation of the issue because our
contributors don't know any better than to put something inappropriate
there.  But the edit screen works.

I think Lucene might release v9 without Solr if Solr is going to drag its
feet for too long.  Wearing my Lucene hat (er... well shirt), I support
that within reason.

Agreed on Confluence.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jul 2, 2020 at 2:30 PM Gus Heck  wrote:

> Looking at
> https://issues.apache.org/jira/projects/SOLR?selectedItem=com.atlassian.jira.jira-projects-plugin%3Arelease-page=unreleased
>  maybe
> this is as simple as having one more field on our issues... Currently fix
> version denotes when something got fixed, perhaps a "target version" field
> could indicate when we want to fix it by. Then we just need a tag in jira
> and perhaps a branch.
>
> Alternately (or maybe additionally) we could make a "board" if that's
> easier to monitor.
>
> On Thu, Jul 2, 2020 at 2:02 PM Gus Heck  wrote:
>
>> Jira typically has features for designating what's in a release
>>
>> On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson 
>> wrote:
>>
>>> I totally expect some things to bubble up when we try to release with
>>> Gradle, the tarball being one. I don’t think that’s a very big issue, but
>>> if you have lots of “not very big” issues they do add up.
>>>
>>> That said, yeah, I do think it’s time to start getting a handle on 9.0.
>>> Pulling Ant out of the build system is another possibility.
>>>
>>> Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a
>>> major release?
>>>
>>> Sound like a Wiki page or some such to me…
>>>
>>> Erick
>>>
>>> > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
>>> >
>>> > Autoscaling is another big item, but I think we have to put it into
>>> 9x, it’s a critical (and critically broken) functionality. We’re making
>>> some progress with Ilan and Noble so I’m cautiously optimistic.
>>> >
>>> >> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
>>> >>
>>> >> Should we have one?
>>> >>
>>> >> With 9x having java 11 and gradle migrations on the dev side, and
>>> about to have a significant round of deprecations/removals and migrations
>>> to plugin for things such as CDCR, DIH etc (see
>>> https://issues.apache.org/jira/browse/SOLR-13442 and
>>> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?)
>>> need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
>>> streaming) before 9x is able to be released. Plus there's been talk of a
>>> revamped UI...
>>> >>
>>> >> I'm worried that there is a danger that 9x will continue to diverge
>>> and pick up major changes, but always have something big in progress and
>>> never be able to release.
>>> >>
>>> >> Perhaps we should attempt to put a box around the things that need to
>>> happen for 9x, and begin targeting any larger projects that come up at 10x?
>>> Among other things the gradle work probably can't be complete until someone
>>> has gone through a release using it. (I don't think we build the tarballs
>>> in gradle yet for example, unless that got added recently)
>>> >>
>>> >> -Gus
>>> >>
>>> >> --
>>> >> http://www.needhamsoftware.com (work)
>>> >> http://www.the111shift.com (play)
>>> >
>>>
>>>
>>>
>>>
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


Re: RoadMap?

2020-07-02 Thread Cassandra Targett
I’ll throw out the same suggestion I made when we had the same conversation 
about 6 months ago - we should make a 9.0 Roadmap wiki page, use it to write 
down & agree on goals, then add labels to Jira issues so we can go back to the 
wiki page and add queries which automatically query Jira and return issues and 
their status for each goal area. This use case is at least half the reason why 
a deep integration between Confluence and Jira exists.
On Jul 2, 2020, 1:39 PM -0500, Erick Erickson , wrote:
> There’s value IMO in having the discussion in one place rather than having to 
> search all of the JIRA tickets...
>
> > On Jul 2, 2020, at 2:30 PM, Gus Heck  wrote:
> >
> > Looking at 
> > https://issues.apache.org/jira/projects/SOLR?selectedItem=com.atlassian.jira.jira-projects-plugin%3Arelease-page=unreleased
> >  maybe this is as simple as having one more field on our issues... 
> > Currently fix version denotes when something got fixed, perhaps a "target 
> > version" field could indicate when we want to fix it by. Then we just need 
> > a tag in jira and perhaps a branch.
> >
> > Alternately (or maybe additionally) we could make a "board" if that's 
> > easier to monitor.
> >
> > On Thu, Jul 2, 2020 at 2:02 PM Gus Heck  wrote:
> > Jira typically has features for designating what's in a release
> >
> > On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson  
> > wrote:
> > I totally expect some things to bubble up when we try to release with 
> > Gradle, the tarball being one. I don’t think that’s a very big issue, but 
> > if you have lots of “not very big” issues they do add up.
> >
> > That said, yeah, I do think it’s time to start getting a handle on 9.0. 
> > Pulling Ant out of the build system is another possibility.
> >
> > Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a major 
> > release?
> >
> > Sound like a Wiki page or some such to me…
> >
> > Erick
> >
> > > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
> > >
> > > Autoscaling is another big item, but I think we have to put it into 9x, 
> > > it’s a critical (and critically broken) functionality. We’re making some 
> > > progress with Ilan and Noble so I’m cautiously optimistic.
> > >
> > > > On 2 Jul 2020, at 18:58, Gus Heck  wrote:
> > > >
> > > > Should we have one?
> > > >
> > > > With 9x having java 11 and gradle migrations on the dev side, and about 
> > > > to have a significant round of deprecations/removals and migrations to 
> > > > plugin for things such as CDCR, DIH etc (see 
> > > > https://issues.apache.org/jira/browse/SOLR-13442 and 
> > > > https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?) 
> > > > need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
> > > > streaming) before 9x is able to be released. Plus there's been talk of 
> > > > a revamped UI...
> > > >
> > > > I'm worried that there is a danger that 9x will continue to diverge and 
> > > > pick up major changes, but always have something big in progress and 
> > > > never be able to release.
> > > >
> > > > Perhaps we should attempt to put a box around the things that need to 
> > > > happen for 9x, and begin targeting any larger projects that come up at 
> > > > 10x? Among other things the gradle work probably can't be complete 
> > > > until someone has gone through a release using it. (I don't think we 
> > > > build the tarballs in gradle yet for example, unless that got added 
> > > > recently)
> > > >
> > > > -Gus
> > > >
> > > > --
> > > > http://www.needhamsoftware.com (work)
> > > > http://www.the111shift.com (play)
> > >
> >
> >
> >
> >
> >
> > --
> > http://www.needhamsoftware.com (work)
> > http://www.the111shift.com (play)
> >
> >
> > --
> > http://www.needhamsoftware.com (work)
> > http://www.the111shift.com (play)
>
>
>


Re: RoadMap?

2020-07-02 Thread Erick Erickson
There’s value IMO in having the discussion in one place rather than having to 
search all of the JIRA tickets...

> On Jul 2, 2020, at 2:30 PM, Gus Heck  wrote:
> 
> Looking at 
> https://issues.apache.org/jira/projects/SOLR?selectedItem=com.atlassian.jira.jira-projects-plugin%3Arelease-page=unreleased
>  maybe this is as simple as having one more field on our issues... Currently 
> fix version denotes when something got fixed, perhaps a "target version" 
> field could indicate when we want to fix it by. Then we just need a tag in 
> jira and perhaps a branch. 
> 
> Alternately (or maybe additionally) we could make a "board" if that's easier 
> to monitor. 
> 
> On Thu, Jul 2, 2020 at 2:02 PM Gus Heck  wrote:
> Jira typically has features for designating what's in a release
> 
> On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson  wrote:
> I totally expect some things to bubble up when we try to release with Gradle, 
> the tarball being one. I don’t think that’s a very big issue, but if you have 
> lots of “not very big” issues they do add up.
> 
> That said, yeah, I do think it’s time to start getting a handle on 9.0. 
> Pulling Ant out of the build system is another possibility.
> 
> Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a major 
> release?
> 
> Sound like a Wiki page or some such to me…
> 
> Erick
> 
> > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
> > 
> > Autoscaling is another big item, but I think we have to put it into 9x, 
> > it’s a critical (and critically broken) functionality. We’re making some 
> > progress with Ilan and Noble so I’m cautiously optimistic.
> > 
> >> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
> >> 
> >> Should we have one?
> >> 
> >> With 9x having java 11 and gradle migrations on the dev side, and about to 
> >> have a significant round of deprecations/removals and migrations to plugin 
> >> for things such as CDCR, DIH etc (see 
> >> https://issues.apache.org/jira/browse/SOLR-13442 and 
> >> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?) 
> >> need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
> >> streaming) before 9x is able to be released. Plus there's been talk of a 
> >> revamped UI...
> >> 
> >> I'm worried that there is a danger that 9x will continue to diverge and 
> >> pick up major changes, but always have something big in progress and never 
> >> be able to release.
> >> 
> >> Perhaps we should attempt to put a box around the things that need to 
> >> happen for 9x, and begin targeting any larger projects that come up at 
> >> 10x? Among other things the gradle work probably can't be complete until 
> >> someone has gone through a release using it. (I don't think we build the 
> >> tarballs in gradle yet for example, unless that got added recently)
> >> 
> >> -Gus
> >> 
> >> -- 
> >> http://www.needhamsoftware.com (work)
> >> http://www.the111shift.com (play)
> > 
> 
> 
> 
> 
> 
> -- 
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
> 
> 
> -- 
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)





Re: RoadMap?

2020-07-02 Thread Gus Heck
Looking at
https://issues.apache.org/jira/projects/SOLR?selectedItem=com.atlassian.jira.jira-projects-plugin%3Arelease-page=unreleased
maybe
this is as simple as having one more field on our issues... Currently fix
version denotes when something got fixed, perhaps a "target version" field
could indicate when we want to fix it by. Then we just need a tag in jira
and perhaps a branch.

Alternately (or maybe additionally) we could make a "board" if that's
easier to monitor.

On Thu, Jul 2, 2020 at 2:02 PM Gus Heck  wrote:

> Jira typically has features for designating what's in a release
>
> On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson 
> wrote:
>
>> I totally expect some things to bubble up when we try to release with
>> Gradle, the tarball being one. I don’t think that’s a very big issue, but
>> if you have lots of “not very big” issues they do add up.
>>
>> That said, yeah, I do think it’s time to start getting a handle on 9.0.
>> Pulling Ant out of the build system is another possibility.
>>
>> Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a
>> major release?
>>
>> Sound like a Wiki page or some such to me…
>>
>> Erick
>>
>> > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
>> >
>> > Autoscaling is another big item, but I think we have to put it into 9x,
>> it’s a critical (and critically broken) functionality. We’re making some
>> progress with Ilan and Noble so I’m cautiously optimistic.
>> >
>> >> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
>> >>
>> >> Should we have one?
>> >>
>> >> With 9x having java 11 and gradle migrations on the dev side, and
>> about to have a significant round of deprecations/removals and migrations
>> to plugin for things such as CDCR, DIH etc (see
>> https://issues.apache.org/jira/browse/SOLR-13442 and
>> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?)
>> need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
>> streaming) before 9x is able to be released. Plus there's been talk of a
>> revamped UI...
>> >>
>> >> I'm worried that there is a danger that 9x will continue to diverge
>> and pick up major changes, but always have something big in progress and
>> never be able to release.
>> >>
>> >> Perhaps we should attempt to put a box around the things that need to
>> happen for 9x, and begin targeting any larger projects that come up at 10x?
>> Among other things the gradle work probably can't be complete until someone
>> has gone through a release using it. (I don't think we build the tarballs
>> in gradle yet for example, unless that got added recently)
>> >>
>> >> -Gus
>> >>
>> >> --
>> >> http://www.needhamsoftware.com (work)
>> >> http://www.the111shift.com (play)
>> >
>>
>>
>>
>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)


Re: RoadMap?

2020-07-02 Thread Gus Heck
Jira typically has features for designating what's in a release

On Thu, Jul 2, 2020 at 1:55 PM Erick Erickson 
wrote:

> I totally expect some things to bubble up when we try to release with
> Gradle, the tarball being one. I don’t think that’s a very big issue, but
> if you have lots of “not very big” issues they do add up.
>
> That said, yeah, I do think it’s time to start getting a handle on 9.0.
> Pulling Ant out of the build system is another possibility.
>
> Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a
> major release?
>
> Sound like a Wiki page or some such to me…
>
> Erick
>
> > On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
> >
> > Autoscaling is another big item, but I think we have to put it into 9x,
> it’s a critical (and critically broken) functionality. We’re making some
> progress with Ilan and Noble so I’m cautiously optimistic.
> >
> >> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
> >>
> >> Should we have one?
> >>
> >> With 9x having java 11 and gradle migrations on the dev side, and about
> to have a significant round of deprecations/removals and migrations to
> plugin for things such as CDCR, DIH etc (see
> https://issues.apache.org/jira/browse/SOLR-13442 and
> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?)
> need a replacement (i.e. CDCR?) or ways to easily re-enable (i.e.
> streaming) before 9x is able to be released. Plus there's been talk of a
> revamped UI...
> >>
> >> I'm worried that there is a danger that 9x will continue to diverge and
> pick up major changes, but always have something big in progress and never
> be able to release.
> >>
> >> Perhaps we should attempt to put a box around the things that need to
> happen for 9x, and begin targeting any larger projects that come up at 10x?
> Among other things the gradle work probably can't be complete until someone
> has gone through a release using it. (I don't think we build the tarballs
> in gradle yet for example, unless that got added recently)
> >>
> >> -Gus
> >>
> >> --
> >> http://www.needhamsoftware.com (work)
> >> http://www.the111shift.com (play)
> >
>
>
>
>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)


Re: RoadMap?

2020-07-02 Thread Erick Erickson
I totally expect some things to bubble up when we try to release with Gradle, 
the tarball being one. I don’t think that’s a very big issue, but if you have 
lots of “not very big” issues they do add up.

That said, yeah, I do think it’s time to start getting a handle on 9.0. Pulling 
Ant out of the build system is another possibility.

Solr as a TLP? Or is that Solr 10? Or does it even have to be as of a major 
release?

Sound like a Wiki page or some such to me…

Erick

> On Jul 2, 2020, at 1:07 PM, Andrzej Białecki  wrote:
> 
> Autoscaling is another big item, but I think we have to put it into 9x, it’s 
> a critical (and critically broken) functionality. We’re making some progress 
> with Ilan and Noble so I’m cautiously optimistic.
> 
>> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
>> 
>> Should we have one?
>> 
>> With 9x having java 11 and gradle migrations on the dev side, and about to 
>> have a significant round of deprecations/removals and migrations to plugin 
>> for things such as CDCR, DIH etc (see 
>> https://issues.apache.org/jira/browse/SOLR-13442 and 
>> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?) need 
>> a replacement (i.e. CDCR?) or ways to easily re-enable (i.e. streaming)
>> before 9x is able to be released. Plus there's been talk of a revamped UI...
>> 
>> I'm worried that there is a danger that 9x will continue to diverge and pick 
>> up major changes, but always have something big in progress and never be 
>> able to release.
>> 
>> Perhaps we should attempt to put a box around the things that need to happen 
>> for 9x, and begin targeting any larger projects that come up at 10x? Among 
>> other things the gradle work probably can't be complete until someone has 
>> gone through a release using it. (I don't think we build the tarballs in 
>> gradle yet for example, unless that got added recently)
>> 
>> -Gus
>> 
>> -- 
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
> 





Re: RoadMap?

2020-07-02 Thread Andrzej Białecki
Autoscaling is another big item, but I think we have to put it into 9x, it’s a 
critical (and critically broken) functionality. We’re making some progress with 
Ilan and Noble so I’m cautiously optimistic.

> On 2 Jul 2020, at 18:58, Gus Heck  wrote:
> 
> Should we have one?
> 
> With 9x having java 11 and gradle migrations on the dev side, and about to 
> have a significant round of deprecations/removals and migrations to plugin 
> for things such as CDCR, DIH etc (see 
> https://issues.apache.org/jira/browse/SOLR-13442 
>  and 
> https://issues.apache.org/jira/browse/SOLR-14022 
> ) some of which may(?) need 
> a replacement (i.e. CDCR?) or ways to easily re-enable (i.e. streaming)
> before 9x is able to be released. Plus there's been talk of a revamped UI...
> 
> I'm worried that there is a danger that 9x will continue to diverge and pick 
> up major changes, but always have something big in progress and never be able 
> to release.
> 
> Perhaps we should attempt to put a box around the things that need to happen 
> for 9x, and begin targeting any larger projects that come up at 10x? Among 
> other things the gradle work probably can't be complete until someone has 
> gone through a release using it. (I don't think we build the tarballs in 
> gradle yet for example, unless that got added recently)
> 
> -Gus
> 
> -- 
> http://www.needhamsoftware.com  (work)
> http://www.the111shift.com  (play)



Re: RoadMap?

2020-07-02 Thread Atri Sharma
+1

On Thu, Jul 2, 2020 at 10:28 PM Gus Heck  wrote:
>
> Should we have one?
>
> With 9x having java 11 and gradle migrations on the dev side, and about to 
> have a significant round of deprecations/removals and migrations to plugin 
> for things such as CDCR, DIH etc (see 
> https://issues.apache.org/jira/browse/SOLR-13442 and 
> https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?) need a 
> replacement (i.e. CDCR?) or ways to easily re-enable (i.e. streaming) before
> 9x is able to be released. Plus there's been talk of a revamped UI...
>
> I'm worried that there is a danger that 9x will continue to diverge and pick 
> up major changes, but always have something big in progress and never be able 
> to release.
>
> Perhaps we should attempt to put a box around the things that need to happen 
> for 9x, and begin targeting any larger projects that come up at 10x? Among 
> other things the gradle work probably can't be complete until someone has 
> gone through a release using it. (I don't think we build the tarballs in 
> gradle yet for example, unless that got added recently)
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)

-- 
Regards,

Atri
Apache Concerted




RoadMap?

2020-07-02 Thread Gus Heck
Should we have one?

With 9x having java 11 and gradle migrations on the dev side, and about to
have a significant round of deprecations/removals and migrations to plugin
for things such as CDCR, DIH etc (see
https://issues.apache.org/jira/browse/SOLR-13442 and
https://issues.apache.org/jira/browse/SOLR-14022) some of which may(?) need
a replacement (i.e. CDCR?) or ways to easily re-enable (i.e. streaming)
before 9x is able to be released. Plus there's been talk of a revamped UI...

I'm worried that there is a danger that 9x will continue to diverge and
pick up major changes, but always have something big in progress and never
be able to release.

Perhaps we should attempt to put a box around the things that need to
happen for 9x, and begin targeting any larger projects that come up at 10x?
Among other things the gradle work probably can't be complete until someone
has gone through a release using it. (I don't think we build the tarballs
in gradle yet for example, unless that got added recently)

-Gus

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)


[jira] [Updated] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-31 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-11917:
-
Component/s: Schema and Analysis

> A Potential Roadmap for robust multi-analyzer TextFields w/various options 
> for configuring docValues
> 
>
> Key: SOLR-11917
> URL: https://issues.apache.org/jira/browse/SOLR-11917
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Schema and Analysis
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
>
> A while back, I was tasked at my day job to brainstorm & design some "smarter 
> field types" in Solr. In particular to think about:
>  # How to simplify some of the "special things" people have to know about 
> Solr behavior when creating their schemas
>  # How to reduce the number of situations where users have to copy/clone one 
> "logical field" into multiple "schema fields" in order to meet diff use cases
> The main result of this thought exercise is a handful of usecases/goals that 
> people seem to have - many of which are already tracked in existing jiras - 
> along with a high level design/roadmap of potential solutions for these goals 
> that can be implemented incrementally to leverage some common changes (and 
> what those changes might look like).
> My intention is to use this jira as a place to share these ideas for broader 
> community discussion, and as a central linkage point for the related jiras. 
> (details to follow in a very looong comment)
> 
> NOTE: I am not (at this point) personally committing to following through on 
> implementing every aspect of these ideas :)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-26 Thread Hoss Man (JIRA)
nalyzer":{
"class":"solr.KeywordAnalyzer" },
 "what everAnalyzer":{
"class":"solr.WhateverAnalyzer" }
}
{code}
...obviously we'd need to include special handling for the existing 
single-analyzer case of {{"analyzer"}} as the attribute name – but that would 
be trivial

  *NOTE:* I feel gross even suggesting this syntax, but it is something 
that would be easily achievable
  In reality, if we did decide to "version" the response structure, this 
kind of "shim" structure (and the code to read/write it) might be needed anyway 
(example: in case a client requesting the "old" format hits a Solr where there 
are some FieldTypes w/arbitrary analyzers)

With the above changes, it would be a fairly simple matter to make methods like 
{{TextField.getFieldQuery}} ask the QParser if there is an {{analyzer}} 
localparam, and if so use it instead of the {{query}} analyzer when parsing 
queries. As before: the *S2.1.HITCH* still applies and would also need to be 
"fixed" for this to work with non-trivial QParsers.
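
A minimal sketch of the lookup described above, written in Python for brevity rather than as Solr code. The {{pick_query_analyzer}} helper, the dictionary of named analyzers, and the choice to reject unknown names are all assumptions made for illustration; only the idea of an {{analyzer}} localparam overriding the {{query}} analyzer comes from the text.

```python
def pick_query_analyzer(named_analyzers, local_params):
    """Return the analyzer to use when parsing a query.

    named_analyzers: dict of analyzer name -> analyzer (hypothetical structure).
    local_params: dict of localparams seen by the query parser (hypothetical).
    """
    requested = local_params.get("analyzer")
    if requested is None:
        # No override requested: keep today's behaviour and use "query".
        return named_analyzers["query"]
    if requested not in named_analyzers:
        # Assumption: an unknown name is a user error, not a silent fallback.
        raise ValueError("unknown analyzer: " + requested)
    return named_analyzers[requested]


# Plain strings stand in for real analyzer objects.
analyzers = {"index": "std-index", "query": "std-query", "exact": "keyword"}
default_choice = pick_query_analyzer(analyzers, {})
override_choice = pick_query_analyzer(analyzers, {"analyzer": "exact"})
```

Whether an unknown analyzer name should fail or fall back to {{query}} is a design choice the text above does not settle.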
h3. *S2.2E*: Example Usage

In addition to how this might simplify *S2.1.STRAW1* (above) and *S1.3* (below) 
this would also make the expert level "I want to be able to have lots of 
arbitrary analyzers I pick between arbitrarily at query time..." usecase very 
clean...
{code:xml}



  
   ...
  
  
   ...
  
  
   ...
  
  
   ...
  

{code}

h2. *S1.3*: TextField supports a type="docValues" analyzer
h3. *S1.3G*: Goal

{{TextField}} should support {{docValues="true"}} and it should be possible to 
configure an arbitrary analyzer (ie: {{<analyzer type="docValues">}}) for 
determining what data gets put in the docValues of a text field for 
sorting/faceting – independently from the index/query analyzer.

For back compatibility, a TextField with {{docValues="true"}} but no 
{{type="docValues"}} analyzer should use the {{type="index"}} analyzer.
h3. *S1.3E*: Example Usage

Building DocValues at index time to get same behavior as FieldCache w/o the 
query RAM/time cost...
{code:java}




  
   ...
  
  
   ...
  

{code}
Special Faceting/Sorting behavior, independent of search behavior...
{code:java}


   




  
  


  

{code}
{code:java}


   
...
  
  



  

{code}
h3. *S1.3A*: Suggested Approach
 * 90% of the work required here would be the bulk of the work from *S2.2*: 
Supporting the configuration of a "docValues" analyzer on TextField
 ** It would be silly to attempt this *S1.3* idea w/o also tackling the 
generalized idea in *S2.2A* (ie: we should stop adding special magic named 
analyzers to the logic in IndexSchema – the "plumbing" should be generic, and 
the FieldTypes should know what analyzer names they care about)
 * Once the plumbing is in place for IndexSchema to allow arbitrary named 
analyzers, the remaining work would be fairly trivial:
 ** Make TextField expect an analyzer named "docValues", implicitly defaulting 
to the effective "index" analyzer if there is no explicit "docValues" analyzer
 ** Refactor the core bit of logic from *S1.2A* (TermDocValuesTextField's 
"TokenStream -> SortedSetDocValuesField") down to TextField
 *** Either deprecate TermDocValuesTextField, or leave it as syntactic sugar 
for TextField w/ {{docValues="true"}}
 ** Refactor SortableTextField (*S1.1A*) so that it's just syntactic sugar for 
TextField with:
 *** {{docValues="true"}}
 *** a "docValues" analyzer using KeywordTokenizer + a new "Truncation" filter 
(to limit the values to the configured # of characters)
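
The defaulting and truncation rules in the list above can be sketched as follows. This is illustrative Python, not Solr code: {{effective_analyzer}}, {{keyword_truncate}} and the toy {{index_analyzer}} are hypothetical names, and analyzers are modelled as plain functions from text to a list of terms.

```python
def effective_analyzer(named_analyzers, name):
    """Look up a named analyzer, applying the implicit docValues default."""
    if name == "docValues" and "docValues" not in named_analyzers:
        # No explicit "docValues" analyzer: fall back to the "index" analyzer.
        name = "index"
    return named_analyzers[name]


def keyword_truncate(max_chars):
    """Stand-in for KeywordTokenizer + a "Truncation" filter.

    The whole input is kept as a single term, cut to max_chars characters.
    """
    def analyze(text):
        return [text[:max_chars]]
    return analyze


def index_analyzer(text):
    # Toy index-time analysis: lowercase and split on whitespace.
    return text.lower().split()


# A "SortableTextField"-like type: explicit docValues analyzer.
sortable = {"index": index_analyzer, "docValues": keyword_truncate(5)}
# A plain TextField with docValues="true" but no docValues analyzer.
plain = {"index": index_analyzer}

sortable_terms = effective_analyzer(sortable, "docValues")("Hello World")
plain_terms = effective_analyzer(plain, "docValues")("Hello World")
```

Here the first type would store one truncated value per document for sorting, while the second stores the indexed terms, which matches the back-compat rule in *S1.3G*.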




[jira] [Commented] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-26 Thread Hoss Man (JIRA)
below), then the schema syntax could 
potentially be simplified to remove the "lang -> some other fieldType name" 
mapping and instead use lots of nested analyzers named after each language
 *** this might still be a bit confusing however if people want diff 
index/query(/multiTerm) analyzers for each language ... would have to use some 
sort of rigid naming convention?

{panel}
*NOTE:* If either strawman is implemented, we should strongly consider 
including an additional option/subclass of this new "*LangAwareTextField" to 
automatically use the langid plugin code at query time to try and "guess" the 
lang if it isn't specified in a 'lang' local/request params
 * at least for language-detect (latest version), there are special models 
built just for short inputs
 * we could potentially make the code use the guessed lang at query time only 
if above some configured confidence:
 ** or: if explicit 'lang' param, use only that lang – but if the language is 
guessed, query using both the field/analyzer for that specific lang as well as 
the 'default' field/analyzer{panel}
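
One way to read the options in the note above, sketched in Python rather than Solr code. The {{fields_to_query}} helper, the {{base_lang}} field naming, the {{(lang, confidence)}} tuple and the 0.8 threshold are all assumptions for illustration; the langid plugin's real API is not shown here.

```python
def fields_to_query(base, explicit_lang=None, guess=None, min_confidence=0.8):
    """Return the list of field names a query should be run against.

    guess: optional (lang, confidence) pair from a language detector
           (hypothetical shape).
    """
    default_field = base + "_default"
    if explicit_lang is not None:
        # An explicit 'lang' param wins and is used on its own.
        return [base + "_" + explicit_lang]
    if guess is not None:
        lang, confidence = guess
        if confidence >= min_confidence:
            # Guessed language: query that language *and* the default.
            return [base + "_" + lang, default_field]
    # No param and no confident guess: default field/analyzer only.
    return [default_field]


explicit = fields_to_query("title", explicit_lang="fr")
confident = fields_to_query("title", guess=("de", 0.93))
unsure = fields_to_query("title", guess=("de", 0.40))
```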
 

 




[jira] [Commented] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-26 Thread Hoss Man (JIRA)
m the (original) TokenStream ... needs 
experimentation/refactoring once we have some tests.

NOTE: If/when *S1.3A* is implemented, this TermDocValuesTextField could be 
refactored to be syntactic sugar for TextField w/ some added defaults – see 
below.

 

 




[jira] [Commented] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341681#comment-16341681
 ] 

Hoss Man commented on SOLR-11917:
-

h2. Hoss's High Level Thoughts on these Goals / *U*secases

While it's certainly possible to tackle some of these objectives independently 
of the others, either from a standpoint of incremental feature delivery, or 
from a standpoint of "end user ease of use" there definitely seems to be some 
overlap here that's worth considering.

In particular:
 * While there is certainly some non-trivial set of possible implementations 
that can satisfy both *U2.1* and *U2.2*, my gut impression is that no one 
implementation will really fit both usecases well in an easy to use/understand 
way. I'm also pretty confident that the "multi-language" use cases would be 
easier to solve/build in a "clean" (and easy for users to understand) approach 
more simply / quickly than any (non-silly) solutions that would support the 
"let me shoot myself in the foot if I want" objectives.
 * While I personally don't feel that the *U2.2* usecase is a particularly good 
idea, the overall "plumbing" involved in supporting this type of usecase would 
be very helpful towards supporting *U1.3*
 * Likewise: *U1.1* and *U1.2* should be easy to implement as new FieldTypes 
independent from the more complex needs of *U1.3*. But if *U1.3* was possible, 
then there would likely be potential for refactoring to reduce common code and 
simplify the implementations.

 




[jira] [Commented] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-26 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341679#comment-16341679
 ] 

Hoss Man commented on SOLR-11917:
-

_The following notes were compiled over many months and iteratively 
tweaked/revised – It's likely that in some cases my comments may be 
overlooking/ignorant-of comments/ideas/patches related to some of these 
concepts that were posted after I wrote them and that I just haven't noticed 
since._

_Also: Jira says my notes are too long for one comment, so I have to break it 
up into sections_

h1. High Level Goals / *U*secases

Talking to various customers about their pain points, and reading up on various 
jiras led me to a handful of Text/String related *U*secases that all seemed 
like they have solutions that could either overlap, or be in close proximity, 
when it came to implementation:
 * *U0*: "I want sane defaults when sorting on multivalued fields, not an error"
 ** Low hanging fruit already implemented for PrimitiveFieldType subclasses in 
SOLR-11854
 * *U1*: (SOLR-8362) Add docValues support to TextField (or some new subclasses 
of TextField) – Because...
 ** *U1.1*: "I want to be able to (efficiently) sort on the original input of a 
TextField (using docValues)"
 ** *U1.2*: "I want to be able to (efficiently) facet on (docValues built from) 
the indexed terms of a TextField"
 ** *U1.3*: "I want to be able to (efficiently) sort/facet on docValues built 
from analyzed terms using a completely diff analyzer than what I use for 
searching"
 *** Example: StandardAnalyzer for searching, but lowercased docValues for 
sorting.
 * *U2*: Choose Query Analysis Aspects At Query Time – Because...
 ** *U2.1*: "I want to be able to do multi-language indexing/querying easily so 
it only looks like one 'field' name." (SOLR-6492)
 ** *U2.2*: "I want to be able to have lots of arbitrary analyzers I pick 
between arbitrarily at query time and maybe shoot myself in the foot but it's 
ok i'm an expert and i have special needs." (SOLR-5053)
 *** NOTE: the description of SOLR-5053 does also list multi-lang as a 
motivation, but some of the examples – like "ignore synonyms" – are definitely 
broader in scope than this.

 

 




[jira] [Created] (SOLR-11917) A Potential Roadmap for robust multi-analyzer TextFields w/various options for configuring docValues

2018-01-26 Thread Hoss Man (JIRA)
Hoss Man created SOLR-11917:
---

 Summary: A Potential Roadmap for robust multi-analyzer TextFields 
w/various options for configuring docValues
 Key: SOLR-11917
 URL: https://issues.apache.org/jira/browse/SOLR-11917
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man
Assignee: Hoss Man


A while back, I was tasked at my day job to brainstorm & design some "smarter 
field types" in Solr. In particular to think about:
 # How to simplify some of the "special things" people have to know about Solr 
behavior when creating their schemas
 # How to reduce the number of situations where users have to copy/clone one 
"logical field" into multiple "schema fields" in order to meet diff use cases

The main result of this thought exercise is a handful of usecases/goals that 
people seem to have - many of which are already tracked in existing jiras - 
along with a high level design/roadmap of potential solutions for these goals 
that can be implemented incrementally to leverage some common changes (and what 
those changes might look like).

My intention is to use this jira as a place to share these ideas for broader 
community discussion, and as a central linkage point for the related jiras. 
(details to follow in a very looong comment)

NOTE: I am not (at this point) personally committing to following through on 
implementing every aspect of these ideas :)






Re: Roadmap for fixing features broken by core autodiscovery

2013-10-05 Thread Erick Erickson
Right, let's move this discussion to SOLR-4779. There's some history
here. Sharing named config sets got a bit wrapped up in sharing the
underlying solrconfig object. This latter has been taken off the
table, but we should discuss fixing Trey's issues up. Here's what the
thinking was:
There would be a directory like solr_home/configs/configset1,
solr_home/configs/configset2, etc. Then a new parameter for
core.properties or create or whatever like configset=configset1 that
would be smart enough to look in solr_home/configs for an entire
conf directory named configset1.
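
A rough sketch of that lookup, in Python rather than Solr code. The
resolve_conf_dir helper and the fallback to the core's own conf/ directory
are assumptions for illustration; only the solr_home/configs/<name> layout
and the configset property come from the description above.

```python
from pathlib import Path


def resolve_conf_dir(solr_home, core_dir, core_props):
    """Pick the configuration directory for a core.

    core_props: dict of values read from core.properties (hypothetical).
    """
    configset = core_props.get("configset")
    if configset:
        # Named config set: the whole conf directory lives under
        # solr_home/configs/<configset>.
        return Path(solr_home) / "configs" / configset
    # Assumption: without a configset the core keeps its own conf/ directory.
    return Path(core_dir) / "conf"


shared = resolve_conf_dir("/var/solr", "/var/solr/core1",
                          {"name": "core1", "configset": "configset1"})
private = resolve_conf_dir("/var/solr", "/var/solr/core2", {"name": "core2"})
```

Several cores naming the same configset would then resolve to one shared
directory without sharing an instanceDir.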

Trey:
Does that work for your case? If so, please add your comments to 4779
and we can take it from there. FWIW, I don't think this is especially
hard, but time is always at a premium.

Erick

On Fri, Oct 4, 2013 at 6:51 PM, Shawn Heisey s...@elyograg.org wrote:
> On 10/4/2013 7:21 PM, Trey Grainger wrote:
>> There are two use-cases that appear broken with the new core
>> auto-discovery mechanism:
>>
>> *1) The Core Admin Handler's CREATE command no longer works to create
>> brand new cores*
>> (unless you have logged on the box and created the core's directory
>> structure manually, which largely defeats the purpose of the CREATE
>> command).  With the old Solr.xml format, we could spin up as many cores
>> as we wanted to dynamically with the following command:
>> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
>> ...
>> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data
>>
>> In the new core discovery mode, this exception is now thrown:
>> Error CREATEing SolrCore 'newCore1': Could not create a new core in
>> solr/collection1/as another core is already defined there
>
> The CREATE action has *always* required that you have your configuration
> on the disk before you call it.  You are sharing the instanceDir, which
> is the only reason you can skip that step.
>
> If you want completely dynamic creation, use SolrCloud, which keeps the
> config in zookeeper and requires ZERO config information to exist on the
> disk.
>
>> *2) Having a shared configuration directory (instanceDir) across many
>> cores no longer works*.
>> Every core has to have its own conf/ directory, and this doesn't seem
>> to be overridable any longer.  Previously, it was possible to have many
>> cores share the same instanceDir (and just override their dataDir for
>> obvious reasons).  Now, it is necessary to copy and paste identical
>> config files for each Solr core.
>
> From what I understand talking to the people that worked on this, the
> lack of a shared instanceDir was completely deliberate.  It's the only
> way that core discovery can work in any kind of predictable and sane
> manner.  The entire point of it is that every core is self-contained and
> solr.xml isn't used to tell Solr about them.
>
> I personally have never tried to share the instanceDir.  I do have
> shared configs, though - my corename/conf directories have symlinks to a
> shared config directory.  I also don't dynamically create cores - I have
> seven shards, each of which has a live core and a build core.  There are
> two other cores that serve as frontends, with the shards parameter in
> the request handlers.
>
>> I don't know if there's already a current roadmap for fixing this.  I
>> saw https://issues.apache.org/jira/browse/SOLR-4478, which suggested
>> replacing instanceDir with the ability to specify a named configSet.
>> This solves problem 2, but not problem 1 (since you still can't have
>> multiple core.properties files in the same folder).  Based on Erick's
>> comments in the JIRA ticket, it also sounds like this ticket is
>> dead at the moment.
>>
>> There is definitely a need to have a shared config directory - whether
>> that is through a configSet or an explicit indexDir doesn't matter to
>> me.  There's also a need to be able to dynamically create Solr cores
>> from external systems.  I currently can't upgrade to core auto discovery
>> because it doesn't allow dynamic core creation.  Does anyone have some
>> thoughts on how to best get these features working again under core
>> autodiscovery?  Adding instanceDir to core.properties seems like an easy
>> solution, but there must be a desire not to do that or it would probably
>> have already been done.
>
> Thankfully, you do not need to upgrade to core discovery anytime soon.
> All future 4.x versions will support the old format, and any problems
> with that will be considered bugs.  It will be mandatory in Solr 5.0,
> which currently doesn't have any kind of release roadmap or timeframe.
> I suspect that what we currently call SolrCloud will also be mandatory
> in 5.0, and that gives you shared configs with zookeeper.  Requiring
> zookeeper allows completely dynamic core/collection creation, because
> the only thing that will be on the disk is the index and transaction log
> data.
>
> Thanks,
> Shawn



Roadmap for fixing features broken by core autodiscovery

2013-10-04 Thread Trey Grainger
There are two use-cases that appear broken with the new core auto-discovery
mechanism:

*1) The Core Admin Handler's CREATE command no longer works to create brand
new cores*
(unless you have logged on the box and created the core's directory
structure manually, which largely defeats the purpose of the CREATE
command).  With the old Solr.xml format, we could spin up as many cores as
we wanted to dynamically with the following command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
...
http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data
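
For an external system generating these requests, a small sketch of building
the URLs in Python. The create_core_url helper and the base URL are
assumptions for illustration; the action, name, instanceDir and dataDir
parameters are the ones used in the commands above.

```python
from urllib.parse import urlencode


def create_core_url(base, name, instance_dir, data_dir):
    """Build a CoreAdmin CREATE request URL for one core."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
        "dataDir": data_dir,
    }
    # safe="/" keeps the slash in dataDir readable instead of %2F.
    return base + "/admin/cores?" + urlencode(params, safe="/")


# Every core shares instanceDir=collection1 and overrides only dataDir.
urls = [
    create_core_url("http://localhost:8983/solr", "newCore%d" % i,
                    "collection1", "newCore%d/data" % i)
    for i in range(1, 4)
]
```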

In the new core discovery mode, this exception is now thrown:
Error CREATEing SolrCore 'newCore1': Could not create a new core in
solr/collection1/as another core is already defined there

The exception is being intentionally thrown in CorePropertiesLocator.java
because a core.properties file already exists in solr/collection1 (and only
one can exist per directory).
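
To make the constraint concrete, here is a toy model of the check in Python.
It is not the code in CorePropertiesLocator.java; the create_core helper and
the exact error wording are assumptions, and only the "one core.properties
per directory" rule is taken from the behaviour described above.

```python
import tempfile
from pathlib import Path


def create_core(instance_dir, name):
    """Register a core by writing core.properties into its instanceDir."""
    props = Path(instance_dir) / "core.properties"
    if props.exists():
        # Only one core.properties may exist per directory, so a second
        # core cannot reuse an instanceDir that already defines a core.
        raise RuntimeError(
            "Could not create a new core in %s: "
            "another core is already defined there" % instance_dir)
    Path(instance_dir).mkdir(parents=True, exist_ok=True)
    props.write_text("name=%s\n" % name)
    return props


shared_dir = tempfile.mkdtemp()
create_core(shared_dir, "collection1")       # first core: succeeds
try:
    create_core(shared_dir, "newCore1")      # same instanceDir: rejected
    second_created = True
except RuntimeError:
    second_created = False
```

Under this rule, sharing an instanceDir across cores fails by construction,
which is why the two problems below are linked.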


*2) Having a shared configuration directory (instanceDir) across many cores
no longer works*.
Every core has to have its own conf/ directory, and this doesn't seem to
be overridable any longer.  Previously, it was possible to have many cores
share the same instanceDir (and just override their dataDir for obvious
reasons).  Now, it is necessary to copy and paste identical config files
for each Solr core.


I don't know if there's already a current roadmap for fixing this.  I saw
https://issues.apache.org/jira/browse/SOLR-4478, which suggested replacing
instanceDir with the ability to specify a named configSet.  This solves
problem 2, but not problem 1 (since you still can't have multiple
core.properties files in the same folder).  Based on Erick's comments in
the JIRA ticket, it also sounds like this ticket is dead at the moment.

There is definitely a need to have a shared config directory - whether that
is through a configSet or an explicit indexDir doesn't matter to me.
 There's also a need to be able to dynamically create Solr cores from
external systems.  I currently can't upgrade to core auto discovery because
it doesn't allow dynamic core creation.  Does anyone have some thoughts on
how to best get these features working again under core autodiscovery?
 Adding instanceDir to core.properties seems like an easy solution, but
there must be a desire not to do that or it would probably have already
been done.

I'm happy to contribute some time to resolving this if there is an agreed-upon
path forward.


Thanks,

-Trey


Re: Roadmap for fixing features broken by core autodiscovery

2013-10-04 Thread Shawn Heisey
On 10/4/2013 7:21 PM, Trey Grainger wrote:
> There are two use-cases that appear broken with the new core
> auto-discovery mechanism:
>
> *1) The Core Admin Handler's CREATE command no longer works to create
> brand new cores*
> (unless you have logged on the box and created the core's directory
> structure manually, which largely defeats the purpose of the CREATE
> command).  With the old Solr.xml format, we could spin up as many cores
> as we wanted to dynamically with the following command:
> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
> ...
> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data
>
> In the new core discovery mode, this exception is now thrown:
> Error CREATEing SolrCore 'newCore1': Could not create a new core in
> solr/collection1/as another core is already defined there

The CREATE action has *always* required that you have your configuration
on the disk before you call it.  You are sharing the instanceDir, which
is the only reason you can skip that step.

If you want completely dynamic creation, use SolrCloud, which keeps the
config in zookeeper and requires ZERO config information to exist on the
disk.

> *2) Having a shared configuration directory (instanceDir) across many
> cores no longer works*.
> Every core has to have its own conf/ directory, and this doesn't seem
> to be overridable any longer.  Previously, it was possible to have many
> cores share the same instanceDir (and just override their dataDir for
> obvious reasons).  Now, it is necessary to copy and paste identical
> config files for each Solr core.

From what I understand talking to the people that worked on this, the
lack of a shared instanceDir was completely deliberate.  It's the only
way that core discovery can work in any kind of predictable and sane
manner.  The entire point of it is that every core is self-contained and
solr.xml isn't used to tell Solr about them.

I personally have never tried to share the instanceDir.  I do have
shared configs, though - my corename/conf directories have symlinks to a
shared config directory.  I also don't dynamically create cores - I have
seven shards, each of which has a live core and a build core.  There are
two other cores that serve as frontends, with the shards parameter in
the request handlers.

> I don't know if there's already a current roadmap for fixing this.  I
> saw https://issues.apache.org/jira/browse/SOLR-4478, which suggested
> replacing instanceDir with the ability to specify a named configSet.
> This solves problem 2, but not problem 1 (since you still can't have
> multiple core.properties files in the same folder).  Based on Erick's
> comments in the JIRA ticket, it also sounds like this ticket is
> dead at the moment.
>
> There is definitely a need to have a shared config directory - whether
> that is through a configSet or an explicit indexDir doesn't matter to
> me.  There's also a need to be able to dynamically create Solr cores
> from external systems.  I currently can't upgrade to core auto discovery
> because it doesn't allow dynamic core creation.  Does anyone have some
> thoughts on how to best get these features working again under core
> autodiscovery?  Adding instanceDir to core.properties seems like an easy
> solution, but there must be a desire not to do that or it would probably
> have already been done.

Thankfully, you do not need to upgrade to core discovery anytime soon.
All future 4.x versions will support the old format, and any problems
with that will be considered bugs.  It will be mandatory in Solr 5.0,
which currently doesn't have any kind of release roadmap or timeframe.
I suspect that what we currently call SolrCloud will also be mandatory
in 5.0, and that gives you shared configs with zookeeper.  Requiring
zookeeper allows completely dynamic core/collection creation, because
the only thing that will be on the disk is the index and transaction log
data.

Thanks,
Shawn


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Solr 5.0 roadmap added to wiki

2013-09-03 Thread Shawn Heisey
I added a roadmap section to the Solr5.0 page.  At this moment I can
only think of one major thing we are planning for the 5.0 release, and I
put it on there.

https://wiki.apache.org/solr/Solr5.0

If that should be a separate page rather than part of the main 5.0 page,
I'm perfectly OK with it being changed.

Thanks,
Shawn




RE: [Lucene.Net] Roadmap

2011-11-24 Thread Christopher Currens
Yes, a lot of it is done.  Porting highlighter is partially done, not
committed, because it relies on the memory contrib package in java, which
I've also ported, but the tests fail.  The last contrib project I've worked
on was snowball.  If you look at the commit log, I've tried to mention what
contrib I worked on.  Those and highlighter/memory are all I've done, the
rest is up for grabs.

I just finished a 12 hour drive from Portland to San Francisco, so I don't
know how legible the above is.  I'll take another look at what I've done
and what needs to be done tomorrow or so, but I think it's pretty accurate.

- Christopher
On Nov 23, 2011 10:53 PM, Prescott Nasser geobmx...@hotmail.com wrote:


 Something else we need to consider is that topScore and TopScore is
 perfectly valid for a function and field name in the same class, but it
 will never be CLS compliant, and VB wouldn't work with Lucene.Net as is.



 
  Date: Tue, 22 Nov 2011 09:42:03 -0800
  From: currens.ch...@gmail.com
  To: lucene-net-dev@lucene.apache.org
  Subject: Re: [Lucene.Net] Roadmap
 
  Regarding the short term goals that Scott mentioned, I agree. I think over
  the past 9 months that we've been active, it's time we see what we need to
  do to graduate from the incubator. Also, 3.0.3 is actually close to a
  release, *depending* on how we feel about the Contrib libraries, which I'll
  discuss in a separate thread.
 
  Scott didn't mention directly, but I think it would be good to port the 3.x
  branch past 3.0.3. Lucene has released 3.1, 3.2, 3.3, and 3.4 in addition
  to 3.0.3. Whether this means we release all those versions, or just port
  up to 3.4 and just release it, that's something we'd all have to agree
  upon. I want to get a 3.x branch up to where Java's is. Also, deciding if
  porting 4.0 can happen at the same time as 3.x is worked on and how to go
  about it, particularly how far we want to diverge from java. Either way, I
  think maintaining both 3.x and 4.x would be a good thing for the community
  to have.
 
 
  On Tue, Nov 22, 2011 at 8:56 AM, Scott Lombard lombardena...@gmail.com
 wrote:
 
   Mike,
  
   You're right about putting together a higher level discussion. Here are
   the road map items I see. I am interested in what others have to say.
  
   None of the items I have listed are contingent on the others so they can
   be done in parallel or out of order.
  
  
   1) Complete the release of 2.9.4
   2) Create and release 3.0.3
  
   3) Graduate from the incubator
   4) Document a porting process that the community can reference.
   5) Port 4.0
  
  
  
   Scott
  

RE: [Lucene.Net] Roadmap

2011-11-24 Thread Prescott Nasser

Welcome to SF!



Re: [Lucene.Net] Roadmap

2011-11-24 Thread Christopher Currens
Well, I'm technically in Berkeley.  I'm hoping it gets sunny soon, though.
I guess I can't complain though, it's warmer here and it's not raining like
it is in Portland. :)

So, for the Contrib section, I've ported:

* Contrib.Analyzers
* Contrib.FastVectorHighlighter
* Contrib.Queries
* Contrib.Regex (there's an issue with one of the tests, it's been marked
as ignored, has to do with differences in the regex engines)
* Contrib.Snowball

I updated Contrib.Core/Contrib.SimpleFacetedSearch to build, I couldn't
find anything to port for them, I think they're .NET specific.

So the list of what needs to be done in contrib would be:

* make sure DistributedSearch builds and tests pass
* Port Similarity
* Port SpellChecker
* Port WordNet
* (optional) Port other contrib packages from java (some can't be easily
done)

For the branch as a whole, I want to implement the Dispose pattern properly
and change all classes that follow the Java iterator pattern, to
IEnumerable/IEnumerators.  The code would still be easy to port even after
these changes and it would be a big step in making the project fit in
better with everyday .NET development.  As it is, I've been using extension
methods to convert a TermEnum to an actual enumerator, which is just a
wrapper class that implements IEnumerable<Term>, but it's a huge pain and
really, it probably shouldn't have been implemented as an exact port to begin
with.  Either way, I'd like that to be changed.
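
To make the wrapper idea concrete, here is a minimal sketch written in Java
rather than C#, with Iterable/Iterator standing in for
IEnumerable/IEnumerator.  The Cursor interface is a hypothetical stand-in for
a TermEnum-style class (advance with next(), then read the current value); it
is not the Lucene or Lucene.Net API, and the real wrapper may well look
different.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class CursorAdapterDemo {
    /** Hypothetical cursor-style API: call next() to advance, current() to read. */
    public interface Cursor<T> {
        boolean next();
        T current();
    }

    /** Wraps a cursor so it can be used in a for-each loop (single pass only). */
    public static <T> Iterable<T> asIterable(final Cursor<T> cursor) {
        return new Iterable<T>() {
            @Override
            public Iterator<T> iterator() {
                return new Iterator<T>() {
                    private boolean advanced = false; // has the cursor been moved for the pending element?
                    private boolean hasValue = false; // result of that move

                    @Override
                    public boolean hasNext() {
                        if (!advanced) {
                            hasValue = cursor.next();
                            advanced = true;
                        }
                        return hasValue;
                    }

                    @Override
                    public T next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        advanced = false; // force a fresh move on the next hasNext()
                        return cursor.current();
                    }

                    @Override
                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    /** Small in-memory cursor, used only to exercise the adapter. */
    public static Cursor<String> cursorOver(final List<String> items) {
        return new Cursor<String>() {
            private int pos = -1;

            @Override
            public boolean next() {
                return ++pos < items.size();
            }

            @Override
            public String current() {
                return items.get(pos);
            }
        };
    }

    /** Drains a cursor through the adapter into a list. */
    public static List<String> collect(Cursor<String> cursor) {
        List<String> out = new ArrayList<String>();
        for (String s : asIterable(cursor)) {
            out.add(s);
        }
        return out;
    }
}
```

The adapter only buffers one "did the cursor move" flag, so the underlying
class keeps its line-by-line shape while callers get the idiomatic loop.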

I also agree that getting the library to be CLS compliant is a good goal,
but only in terms of naming.  I don't think the rest of it is important, at
least at this point.  Off the top of my head, besides the example you
mentioned, ScoreDocs has a obsoleted public field topDocs and public
property TopDocs.

I supposed to be on vacation, so I'm trying to keep work I do to a minimum.
:)  If you want to make JIRA issues for this you can, otherwise I will do
it when I get back on Monday.


Thanks,
Christopher


Re: [Lucene.Net] Roadmap

2011-11-22 Thread Michael Herndon
While much of the content in this thread is valid and is important,
especially concerns, pain points, and implementation details... we've
gotten way off topic.

road map != implementation details. We should keep to a much higher
level discussion to get this knocked out.

Let's outline the roadmap, put it in a wiki page.

Then discuss how to go about each major milestone in separate threads to
discuss implementation details. Or at least let the people who are going to
work on that particular milestone publish their intentions to keep everyone
else informed since we're currently in a do-ocracy like state.

And by all means, discuss the next immediate milestones first so people who
want to dive into that can proceed.

So what are the next two major milestones?  And from a higher level
perspective what are the major items that deem those milestones complete?

What would be the next 3 ideal milestones after the first two? And what
would be the intentions for those milestones to accomplish?

- Michael



On Mon, Nov 21, 2011 at 7:28 PM, Christopher Currens 
currens.ch...@gmail.com wrote:

 Next to impossible/really, really hard.  There are just some things that
 don't map quite right.  Sharpen is great, but it seems you need code
 written in a way that makes it easily convertible, and I don't see the
 folks at Lucene changing their coding style to do that.

 An example: 3.0.3 changes classes that inherited from util.Parameter to
 Java enums.  Java enums are more similar to classes than C# enums are.
 They can have methods, fields, etc.  I wound up converting them into enums
 with extension methods and/or static classes (usually to generate the
 enum).  The way the code was written in Java, there's no way an automated
 tool could figure that out on its own, unless you had some sort of way to
 tell it what to do beforehand.
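
For readers who haven't run into this, a minimal sketch of the kind of Java
enum being described.  IndexMode and its members are hypothetical names made
up for illustration, not copied from the Lucene source; the point is only that
each constant carries fields and methods, which a plain C# enum cannot do.

```java
public class EnumDemo {
    // Hypothetical enum for illustration only; not taken from Lucene.
    public enum IndexMode {
        NO(false, false),
        ANALYZED(true, true),
        NOT_ANALYZED(true, false);

        // Per-constant state, set through the enum constructor.
        private final boolean indexed;
        private final boolean analyzed;

        IndexMode(boolean indexed, boolean analyzed) {
            this.indexed = indexed;
            this.analyzed = analyzed;
        }

        // Instance methods on the enum constants.
        public boolean isIndexed() {
            return indexed;
        }

        public boolean isAnalyzed() {
            return analyzed;
        }

        // Static factory that "generates" the enum value from flags.
        public static IndexMode of(boolean indexed, boolean analyzed) {
            if (!indexed) {
                return NO;
            }
            return analyzed ? ANALYZED : NOT_ANALYZED;
        }
    }

    public static void main(String[] args) {
        System.out.println(IndexMode.ANALYZED.isAnalyzed()); // true
        System.out.println(IndexMode.of(true, false));       // NOT_ANALYZED
    }
}
```

In a C# port, the instance methods would have to become extension methods on
the enum and the factory would move to a static class, which is the manual
restructuring described above.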

 I imagine porting it by hand is probably easier, though it would be nice if
 there was a tool that would at least convert the syntax from Java to C#, as
 well as changing the naming scheme to a .NET compatible one.  However, that
 only really helps if you're porting classes from scratch.  It could, also,
 hide bugs, since it's possible, however unlikely, something could port
 perfectly, but not behave the same way.

 A class that has many calls to string.Substring is a good example of this.
 If the name of the function is changed to the .NET version (.substring to
 .Substring), it would compile with no problems, but they are very different.
 C#'s signature is Substring(int start, int count) while Java's is
 substring(int startIndex, int endIndex).  It may work, hiding issues, or it
 may throw an exception, depending on the data.  A porting tool would probably
 know many of the differences like this, so it's sorta a moot point, in that
 this relies on the skills of the developer anyway.
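
A small Java sketch of the mismatch.  substringByCount is a hypothetical
helper that emulates the C#-style (start, count) contract on top of Java's
(begin, end) one, so the same two arguments can be compared side by side.

```java
public class SubstringDemo {
    // Emulates the C#-style contract: take `count` characters starting at `start`.
    public static String substringByCount(String s, int start, int count) {
        return s.substring(start, start + count);
    }

    public static void main(String[] args) {
        String s = "lucene.net";

        // Java reading of (2, 6): from index 2 up to, but not including, index 6.
        System.out.println(s.substring(2, 6));         // cene

        // C#-style reading of (2, 6): 6 characters starting at index 2.
        System.out.println(substringByCount(s, 2, 6)); // cene.n
    }
}
```

With arguments such as (6, 6), the Java reading returns an empty string while
the count-based reading runs past the end of this 10-character string and
throws, which is the "depending on the data" behaviour mentioned above.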

 I may be wrong, but I just don't see this being a fully automated process
 ever.  I would love to have something automated that at least fixed syntax
 errors, though this would only work on a line-by-line port.  (Slightly off
 topic, I think we should always have a line-by-line port, even if our
 primary goals become focusing on a fully .Net style port)  Either way, any
 sort of manual or partly-automated process would still require a lot of
 work to make sure things are ported correctly.  I also think it's most
 manageable if it were a tool that did it on a file per file basis (instead
 of project level like Sharpen), for easy review and testing.


 Thanks,
 Christopher

 On Mon, Nov 21, 2011 at 3:30 PM, Scott Lombard lombardena...@gmail.com
 wrote:

  Chris,
 
  Now that you have spent some time dealing with the porting what is your
  view
  on creating a fully automated porting tool?
 
  Scott
 
   -Original Message-
   From: Christopher Currens [mailto:currens.ch...@gmail.com]
   Sent: Monday, November 21, 2011 5:23 PM
   To: lucene-net-dev@lucene.apache.org
   Subject: Re: [Lucene.Net] Roadmap
  
   Digy,
  
   No worries.  I wasn't taking them personally.  You've been
   doing this for a lot longer than I have, but I didn't
   understand you pain until I had to go through it personally. :P
  
   Have you looked at Contrib in a while?  There's a lot of
   projects that are in Java's Contrib that are not in
   Lucene.Net?  Is this because there are some that can't easily
   (if at all) be ported over to .NET or just because they've
   been neglected?  I'm trying to get a handle on what's
   important to port and what isn't.  Figured someone with
   experience could help me with a starting point over deciding
   where to start with everything that's missing.
  
  
   Thanks,
   Christopher
  
   On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote:
  
   
Chris,
   
Sorry, if you took my comments about pain of porting personally.
That wasn't my intension.
   
+1 for all your changes/divergences. I made/could have made
   them too.
   
DIGY
   
-Original

Re: [Lucene.Net] Roadmap

2011-11-22 Thread Michael Herndon
other possible goals (these can be assigned to any milestone and pushed
back when needed)

*  Enhanced BCL support - interfaces, operator overloads, etc.
*  FxCop support - fix or suppress messages
*  Investigate CLSCompliance - already partially done by Prescott, and many
thanks for that.
*  Enhance developer experience - provide updated demos and common
out-of-the-box scenarios, e.g. use Attributes & Types to index POCOs, Auto
Complete (jquery/jquery ui, etc), basic website/app search.


Other thoughts: maybe we should stagger major releases a bit more in order
to have smaller, more manageable milestones and get bits into the hands of
people sooner, thus not putting so much pressure on major releases to
get it right the first time.

Work on core, release core as a CTP.  Work on bugs from CTP & Contrib,
release a 2nd CTP, then release Beta, then RTW.


As for 4.x:
I'd suggest everyone look at the Lucene trunk to gauge the work required and
new changes. A release by next summer would be too aggressive even if it was
scaled to just simply porting without thinking about it.

Also 4.x would be the best time to make breaking changes since the Java
version itself makes significant breaking changes. This is the one version I
would not want to just get out the door in order to maintain parity with Java.
I also hope passing out of incubation would not depend on this version's
release.





On Tue, Nov 22, 2011 at 12:44 PM, Prescott Nasser geobmx...@hotmail.com wrote:

 My goal is to release 2.9.4 this month - it looks like we have no -1s from
 our dev list so in the next day or so I will put that to the general
 incubator list.

 I'd then like to release 2.9.4g the first or second week of January.

 My thoughts would be to try and have 3.0.3 ready by March and 4.0 in the
 middle of the year. I'm not sure how aggressive people think that is.

 With the 4.0 release we should be at or near parity with Java and ready to
 roll out of the incubator. We could do the 4 release and the graduation
 process at the same time as well


Re: [Lucene.Net] Roadmap

2011-11-22 Thread Stefan Bodewig
On 2011-11-22, Prescott Nasser wrote:

 My goal is to release 2.9.4 this month - it looks like we have no -1s
 from our dev list so in the next day or so I will put that to the
 general incubator list.

Please remember to mention you need two more IPMC member votes.

 With the 4.0 release we should be at or near parity with Java and
 ready to roll out of the incubator. We could do the 4 release and the
 graduation process at the same time as well

For my personal taste this is way too late.  There is no reason why
you'd have to be on par with Java Lucene in order to leave the
incubator.

Stefan


RE: [Lucene.Net] Roadmap

2011-11-22 Thread Prescott Nasser

  My goal is to release 2.9.4 this month - it looks like we have no -1s
  from our dev list so in the next day or so I will put that to the
  general incubator list.

 Please remember to mention you need two more IPMC member votes.

 

I just sent that vote email out - do you want me to update? or do you mind just 
tossing it a +1?

 



  With the 4.0 release we should be at or near parity with Java and
  ready to roll out of the incubator. We could do the 4 release and the
  graduation process at the same time as well

 For my personal taste this is way too late. There is no reason why
 you'd have to be on par with Java Lucene in order to leave the
 incubator.


 

No problem by me, just from what I saw there are a number of podlings that are 
in the incubator for a long time. I figured no harm in staying in. But also, it 
seems like first half of next year is too quick (from those who have already 
started working on it)

 

~P

Re: [Lucene.Net] Roadmap

2011-11-22 Thread Stefan Bodewig
On 2011-11-23, Prescott Nasser wrote:

 My goal is to release 2.9.4 this month - it looks like we have no -1s
 from our dev list so in the next day or so I will put that to the
 general incubator list.

 Please remember to mention you need two more IPMC member votes.

 I just sent that vote email out - do you want me to update? or do you
 mind just tossing it a +1?

I will follow up to it.

 With the 4.0 release we should be at or near parity with Java and
 ready to roll out of the incubator. We could do the 4 release and the
 graduation process at the same time as well

 For my personal taste this is way too late. There is no reason why
 you'd have to be on par with Java Lucene in order to leave the
 incubator.

 No problem by me, just from what I saw there are a number of podlings
 that are in the incubator for a long time. I figured no harm in
 staying in. But also, it seems like first half of next year is too
 quick (from those who have already started working on it)

Oh, I didn't mean release 4.x earlier, I meant graduate earlier.  There
is no reason to defer graduation until 4.x is ready.

Stefan


Re: [Lucene.Net] Roadmap

2011-11-22 Thread Michael Herndon


 In the same vein: sort out the build process in a way that doesn't
 require the tools to be checked into svn.


https://cwiki.apache.org/confluence/display/LUCENENET/Road+Map

Posting these under floating goals on wiki. I already have some ideas on
how this can be done, so I've linked the above goal to a planning page for
the build.  We'll revisit once the other major items are accomplished on
the planning page.

Edit as necessary =)

- Michael


RE: [Lucene.Net] Roadmap

2011-11-22 Thread Scott Lombard
Mike,

You're right about putting together a higher level discussion.  Here are the
road map items I see.  I am interested in what others have to say.

None of the items I have listed are contingent on the others so they can be
done in parallel or out of order.


1) Complete the release of 2.9.4
2) Create and release 3.0.3

3) Graduate from the incubator
4) Document a porting process that the community can reference.
5) Port 4.0



Scott  

 -Original Message-
 From: Michael Herndon [mailto:mhern...@wickedsoftware.net] 
 Sent: Tuesday, November 22, 2011 10:28 AM
 To: lucene-net-...@lucene.apache.org
 Subject: Re: [Lucene.Net] Roadmap
 
 While much of the content in this thread is valid and is 
 important, especially concerns, pain points, and 
 implementation details... we've gotten way off topic.
 
 road map != implementation details. We should keep to a much 
 a higher level discussion to get this knocked out.
 
 Lets outline the roadmap, put it in a wiki page.
 
 Then discuss how to go about each major milestone in separate 
 threads to discuss implementation details. Or at least let 
 the people who are going to work on that particular milestone 
 publish their intentions to keep everyone else informed since 
 we're currently in a do-ocracy like state.
 
 And by all means, discuss the next immediate milestones first 
 so people who want to dive into that can proceed.
 
 So what are the next two major milestones?  And from a higher 
 level perspective what are the major items that deem those 
 milestones complete?
 
 What would be the the next 3 ideal milestones after the first 
 two? And what would be the intentions for those milestones to 
 accomplish?
 
 - Michael
 
 
 
 On Mon, Nov 21, 2011 at 7:28 PM, Christopher Currens  
 currens.ch...@gmail.com wrote:
 
  Next to impossible/really, really hard.  There are just some things 
  that don't map quite right.  Sharpen is great, but it seems 
 you need 
  to code written in a way that makes it easily convertible, 
 and I don't 
  see the folks at Lucene changing their coding style to do that.
 
  An example: 3.0.3 changes classes that inherited from 
 util.Parameter, 
  to java enums.  Java enums are more similar to classes than 
 they are in C#.
   They can have methods, fields, etc.  I wound up converting 
 them into 
  enums with extension methods and/or static classes (usually to 
  generate the enum).  The way the code was written in Java, 
 there's no 
  way a automated tool could figure that out on its own, 
 unless you had 
  some sort of way to tell it what to do before hand.
 
  I imagine porting it by hand is probably easier, though it would be
  nice if there was a tool that would at least convert the syntax from
  Java to C#, as well as changing the naming scheme to a .NET-compatible
  one.  However, that only really helps if you're porting classes from
  scratch.  It could also hide bugs, since it's possible, however
  unlikely, that something could port perfectly but not behave the same
  way.
 
  A class that has many calls to string.Substring is a good example of
  this.  If the name of the function is changed to the .NET version
  (.substring to .Substring), it would compile with no problems, but the
  two are very different.  C#'s signature is Substring(int startIndex,
  int length) while Java's is substring(int beginIndex, int endIndex).
  It may work while hiding issues, or it may throw an exception,
  depending on the data.  A porting tool would probably know many of the
  differences like this, so it's sorta a moot point, in that this relies
  on the skills of the developer anyway.
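
To make the Substring pitfall concrete, here is a minimal Java sketch. The class and helper names (SubstringPort, dotNetSubstring) are hypothetical and not part of Lucene or Lucene.Net; the helper only emulates the (startIndex, length) convention on top of Java's (beginIndex, endIndex) so the two can be compared side by side.

```java
// Sketch only: contrasts Java's substring(beginIndex, endIndex) with the
// .NET-style Substring(startIndex, length) convention.
final class SubstringPort {

    // Hypothetical helper: what a call written against the .NET convention
    // would have to become in Java. The second argument is a length, so it
    // must be translated into an exclusive end index.
    static String dotNetSubstring(String s, int startIndex, int length) {
        return s.substring(startIndex, startIndex + length);
    }

    // Hypothetical helper: the length a ported C# call needs when the Java
    // original passed (beginIndex, endIndex).
    static int toDotNetLength(int beginIndex, int endIndex) {
        return endIndex - beginIndex;
    }
}
```

With the same arguments (2, 5) on "lucene-net", Java's own substring returns 3 characters while the .NET-style reading returns 5, and arguments such as (7, 10) are valid under one convention but out of range under the other. A port that only renames the method therefore compiles cleanly and then either returns different text or throws, depending on the data.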
 
  I may be wrong, but I just don't see this being a fully automated
  process ever.  I would love to have something automated that at least
  fixed syntax errors, though this would only work on a line-by-line
  port.  (Slightly off topic, I think we should always have a
  line-by-line port, even if our primary goals become focusing on a
  fully .NET-style port.)  Either way, any sort of manual or
  partly-automated process would still require a lot of work to make
  sure things are ported correctly.  I also think it's most manageable
  if it were a tool that did it on a file-per-file basis (instead of
  project level like Sharpen), for easy review and testing.
 
 
  Thanks,
  Christopher
 
  On Mon, Nov 21, 2011 at 3:30 PM, Scott Lombard 
  lombardena...@gmail.com
  wrote:
 
   Chris,
  
   Now that you have spent some time dealing with the porting, what is
   your view on creating a fully automated porting tool?
  
   Scott
  
-Original Message-
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Monday, November 21, 2011 5:23 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap
   
Digy,
   
No worries.  I wasn't taking them personally.  You've been doing
this for a lot longer than I have, but I didn't understand your
pain until I had to go through

Re: [Lucene.Net] Roadmap

2011-11-22 Thread Christopher Currens
Regarding the short-term goals that Scott mentioned, I agree.  Now that we've
been active for the past 9 months, I think it's time we see what we need to
do to graduate from the incubator.  Also, 3.0.3 is actually close to a
release, *depending* on how we feel about the Contrib libraries, which I'll
discuss in a separate thread.

Scott didn't mention it directly, but I think it would be good to port the 3.x
branch past 3.0.3.  Lucene has released 3.1, 3.2, 3.3, and 3.4 in addition
to 3.0.3.  Whether this means we release all those versions, or just port
up to 3.4 and release that, is something we'd all have to agree
upon.  I want to get a 3.x branch up to where Java's is.  We also need to
decide whether porting 4.0 can happen at the same time as 3.x is worked on
and how to go about it, particularly how far we want to diverge from Java.
Either way, I think maintaining both 3.x and 4.x would be a good thing for
the community to have.


On Tue, Nov 22, 2011 at 8:56 AM, Scott Lombard lombardena...@gmail.com wrote:

 Mike,

 You're right about putting together a higher-level discussion.  Here are
 the road map items I see.  I am interested in what others have to say.

 None of the items I have listed are contingent on the others, so they can
 be done in parallel or out of order.


 1) Complete the release of 2.9.4
 2) Create and release 3.0.3

 3) Graduate from the incubator
 4) Document a porting process that the community can reference.
 5) Port 4.0



 Scott


RE: [Lucene.Net] Roadmap

2011-11-22 Thread Prescott Nasser
My goal is to release 2.9.4 this month - it looks like we have no -1s from our 
dev list, so in the next day or so I will put that to the general incubator list.

I'd then like to release 2.9.4g the first or second week of January.

My thoughts would be to try and have 3.0.3 ready by March and 4.0 in the middle 
of the year. I'm not sure how aggressive people think that is.

With the 4.0 release we should be at or near parity with Java and ready to roll 
out of the incubator. We could do the 4.0 release and the graduation process at 
the same time as well.

Sent from my Windows Phone


RE: [Lucene.Net] Roadmap

2011-11-21 Thread Digy

Chris,

Sorry if you took my comments about the pain of porting personally. That
wasn't my intention.

+1 for all your changes/divergences. I made/could have made them too.

DIGY

-Original Message-
From: Christopher Currens [mailto:currens.ch...@gmail.com] 
Sent: Monday, November 21, 2011 11:45 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

Digy,

I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the
code in 2.9.4g as a reference for many things, particularly the Support
classes.  We hit many of the same issues, I'm sure.  I moved some of the
anonymous classes into a base class where you could inject functions,
though not all could be replaced, nor did I replace all that could have
been.  Some of our code is different: I went for the option for
WeakDictionary to be completely generic, as in wrapping a generic
dictionary with WeakKey<T> instead of wrapping the already existing
WeakHashTable in Support.  In hindsight, it may have just been easier to
convert the WeakHashTable to generic, but alas, I'm only realizing that
now.  There is a problem with my WeakDictionary, specifically the function
that determines when to clean/compact the dictionary and remove the dead
keys.  I need a better heuristic for deciding when to run the clean.  That's
a performance issue though.
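
As an illustration of the kind of clean-up heuristic in question, here is a minimal Java sketch of a weak-keyed map that sweeps dead keys only after a number of writes equal to the map's size at the previous sweep (with a floor of 16). This is not the Lucene.Net WeakDictionary; the class name WeakKeyMap, the floor of 16, and the "writes since last sweep" rule are all assumptions chosen to keep the sweep cost amortized.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Sketch only: a map with weakly held keys and an amortized sweep heuristic.
final class WeakKeyMap<K, V> {

    // Wraps a key weakly. The hash is cached so a dead wrapper keeps a
    // stable bucket until it is swept.
    private static final class WeakKey<T> extends WeakReference<T> {
        private final int hash;

        WeakKey(T key) {
            super(key);
            this.hash = key.hashCode();
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof WeakKey)) return false;
            Object mine = get();
            Object theirs = ((WeakKey<?>) o).get();
            return mine != null && mine.equals(theirs);
        }
    }

    private final Map<WeakKey<K>, V> map = new HashMap<>();
    private int writesSinceClean = 0;
    private int threshold = 16;   // assumed floor, not taken from the thread
    private int cleanCount = 0;

    void put(K key, V value) {
        map.put(new WeakKey<>(key), value);
        // Heuristic (assumption): sweep once the writes since the last sweep
        // reach the size the map had at that sweep.
        if (++writesSinceClean >= threshold) {
            clean();
        }
    }

    V get(K key) {
        return map.get(new WeakKey<>(key));
    }

    // May include entries whose keys are dead but not yet swept.
    int size() {
        return map.size();
    }

    int cleanCount() {
        return cleanCount;
    }

    // Removes entries whose keys have been garbage collected.
    void clean() {
        map.keySet().removeIf(k -> k.get() == null);
        writesSinceClean = 0;
        threshold = Math.max(16, map.size());
        cleanCount++;
    }
}
```

Because the threshold grows with the live size, a map that keeps growing is swept at roughly doubling intervals rather than on every write; whether that trade-off suits the real WeakDictionary would need measuring.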

Regarding the pain of porting, I am a changed man.  It's nice, in a sad
way, to know that I'm not the only one who experienced those difficulties.
I used to be in the camp that porting code that differed from Java
wouldn't be difficult at all.  However, now I stand corrected!  It threw me
a curve-ball, for sure.  I DO think a line-by-line port can definitely
include the things talked about below, i.e. the changes to Dispose and the
changes to IEnumerable<T>.  Those changes, I think, can be made without a
heavy impact on the porting process.

There was one fairly large change I opted to use that differed quite a bit
from Java, however, and that was the use of the TPL in
ParallelMultiSearcher.  It was far easier to port this way, and I don't
think it affects the porting process too much.  Java uses a helper class
defined at the bottom of the source file that handles it, I'm simply using
a built-in one instead.  I just need to be careful about it, it would be
really easy to get carried away with it.
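
For readers who have not seen the pattern, the fan-out-and-merge idea can be sketched in Java with the standard ExecutorService, which plays roughly the role the TPL plays in .NET. This is not Lucene's ParallelMultiSearcher code; the class name ParallelFanOut and the use of a plain hit count as the "search result" are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: run one task per sub-index in parallel and merge the results.
final class ParallelFanOut {

    // Each Callable stands in for a search against one sub-index and
    // returns its hit count; the merged result is the sum.
    static int totalHits(List<Callable<Integer>> searches, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Integer> result : pool.invokeAll(searches)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel search failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Using a built-in facility like this in place of a hand-written helper changes little about the structure of the calling code, which fits the point above that the divergence need not complicate later porting.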


Thanks,
Christopher

On Mon, Nov 21, 2011 at 1:20 PM, Digy digyd...@gmail.com wrote:

 Hi Chris,

 First of all, thank you for your great work on the 3.0.3 branch.
 I suppose you took 2.9.4 as a code base to make the 3.0.3 port, since some
 of your problems are the same as those I faced in the 2.9.4g branch
 (e.g.,
    Support/MemoryMappedDirectory.cs (but never used in core),
    IDisposable,
    introduction of some Action<T>s, Func<T>s,
    foreach instead of GetEnumerator/MoveNext,
    IEquatable<T>,
    WeakDictionary<T>,
    Set<T>,
    etc.
 )

 Since I also used 3.0.3 as a reference, maybe we can use some of 2.9.4g's
 code in 3.0.3 when necessary (I haven't had time to look into 3.0.3 deeply).

 Just to ensure coordination, maybe you should create a new issue in
 JIRA, so that people send patches to that issue instead of committing
 directly.


 @Prescott,
 2.9.4g is not behind 2.9.4 in terms of bug fixes & features. So it is (I
 think) ready for another release. (I have used it in all my projects for a long time.)


 PS: Hearing about the pain of porting code that greatly differs from Java
 made me just smile (sorry for that :( ). Be ready for responses that go
 beyond the criticism between "With all due respect" & "Just my $0.02"
 parentheses.

 DIGY

 -Original Message-
 From: Christopher Currens [mailto:currens.ch...@gmail.com]
 Sent: Monday, November 21, 2011 10:19 PM
 To: lucene-net-dev@lucene.apache.org; casper...@caspershouse.com
 Subject: Re: [Lucene.Net] Roadmap

 Some of the Lucene classes have Dispose methods, well, ones that call Close
 (and that Close method may or may not call base.Close(), depending on
 whether it is needed).
 Virtual dispose methods can be dangerous only in that they're easy to
 implement wrong.  However, it shouldn't be too bad, at least with a
 line-by-line port, as we would make the call to the base class whenever
 Lucene does, and that would (should) give us the same behavior, implemented
 properly.  I'm not aware of differences in the JVM regarding inheritance
 and base methods being called automatically, particularly Close methods.
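
On the Java side, an overriding close() does not invoke the base implementation unless the subclass calls it explicitly, which is the behavior a line-by-line port would mirror with base.Close() or base.Dispose(). A minimal sketch, with hypothetical class names (BaseReader, DerivedReader) that are not Lucene classes:

```java
import java.util.List;

// Sketch only: shows that the base close() runs only when called explicitly.
class BaseReader implements AutoCloseable {
    protected final List<String> log;

    BaseReader(List<String> log) {
        this.log = log;
    }

    @Override
    public void close() {
        log.add("base closed");
    }
}

class DerivedReader extends BaseReader {
    DerivedReader(List<String> log) {
        super(log);
    }

    @Override
    public void close() {
        log.add("derived closed");
        // Without this explicit call the base resources would never be released.
        super.close();
    }
}
```

Mirroring each explicit super.close() with an explicit base call in the port keeps the release order the same as in the original.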

 Slightly unrelated, another annoyance is the use of Java Iterators vs C#
 Enumerables.  A lot of our code is there simply because there are
 Iterators, but it could be converted to Enumerables.  The whole HasNext,
 Next vs C#'s MoveNext(), Current is annoying, but it's used all over in the
 base code, and would have to be changed there as well.  Either way, I would
 like to push for that before 3.0.3 is released.  IMO, small changes like
 this still keep the code similar to the line-by-line port, in that it
 doesn't add any difficulties

Re: [Lucene.Net] Roadmap

2011-11-21 Thread Christopher Currens
Digy,

No worries.  I wasn't taking them personally.  You've been doing this for a
lot longer than I have, but I didn't understand your pain until I had to go
through it personally. :P

Have you looked at Contrib in a while?  There are a lot of projects in
Java's Contrib that are not in Lucene.Net.  Is this because there are
some that can't easily (if at all) be ported over to .NET, or just because
they've been neglected?  I'm trying to get a handle on what's important to
port and what isn't.  I figured someone with experience could help me
decide where to start with everything that's missing.


Thanks,
Christopher


RE: [Lucene.Net] Roadmap

2011-11-21 Thread Digy
My English isn't good enough to understand this answer. I hope it is not
related to an employee-employer relationship, as in the past.

DIGY

-Original Message-
From: Christopher Currens [mailto:currens.ch...@gmail.com] 
Sent: Tuesday, November 22, 2011 1:08 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

To clarify, it wasn't as much *difficult* as it was *painful*.  Above,
I was implying that it was more difficult than the rest of the code, which
by comparison was easier.  It wasn't painless to try and map where code
changes were from the Java classes into the .NET version.  I prefer that
style more for its readability and the niceties of working with a .NET
style of Lucene; however, as I said before, it significantly slowed down the
porting process.  I hope it didn't come across that I thought that it was
bad code, because it's probably the most readable code we have in the
Contrib at the moment.

I want to make it clear that my intention right now is to get Lucene.Net up
to date with Java.  When I read the Java code, I understand its intent, and
I make sure the ported code represents it.  That takes enough time as it
is; having to figure out where the code went in Lucene.Net, since
it wasn't a 1-1 map, was a MINOR annoyance, especially when you compare it
to the issues I had dealing with the differences between the two languages,
generics especially.  That being said, I don't have a problem with code
being converted in a .NET-idiomatic way; in fact, I welcome it, if it still
allows the changes to be ported with minimal effort.  I feel at this point
in the project, there are some limitations to how far I'd like it to
diverge.

Anyway, my opinion, which may not be in agreement with the group as a
whole, is that it would be better to bring the codebase up to date, or at
least more up to date with Java's, and then maintain a version with a
complete .NET-centric API.  I feel this would be easier, as porting
Java's Lucene SVN commits by the week would be a relatively small workload.

On Mon, Nov 21, 2011 at 2:41 PM, Troy Howard thowar...@gmail.com wrote:

 So, if we're getting back to the line by line port discussion... I
 think either side of this discussion is too extreme. For the case in
 point Chris just mentioned (which I'm not really sure what part was so
 difficult, as I ported that library in about 30 minutes from
 scratch)... anything is a pain if it sticks out in the middle of doing
 something completely different.

 The only reason we are able to do this line by line is due to the
 general similarity between Java and C#'s language syntax. If we were
 porting Lucene to a completely different language, that had a totally
 different syntax, the process would go like this:

 - Look at the original code, understand its intent
 - Create similar code in the new language that expresses the same intent

 When applying changes:

 - Look at the original code diffs, understanding the intent of the change
 - Look at the ported code, and apply the changed logic's meaning in
 that language

 So, it is just a different thought process. In my opinion, it's a better
 process because it forces the developer to actually think about the
 code instead of blindly converting syntax (possibly slightly
 incorrectly and introducing regressions). While there is a large
 volume of unit tests in Lucene, they are unfortunately not really the
 right tests and make porting much more difficult, because it's hard to
 verify that your ported code behaves the same because you can't just
 rely on the unit tests to verify your port. Therefore, it's safer to
 follow a process that requires the developer to delve deeply into the
 meaning of the code. Following a line-by-line process is convenient,
 but doesn't focus on meaning, which I think is more important.

 Thanks,
 Troy


RE: [Lucene.Net] Roadmap

2011-11-21 Thread Scott Lombard
Chris,

Now that you have spent some time dealing with the porting, what is your view
on creating a fully automated porting tool?

Scott  


Re: [Lucene.Net] Roadmap

2011-11-21 Thread Christopher Currens
Next to impossible/really, really hard.  There are just some things that
don't map quite right.  Sharpen is great, but it seems you need code
written in a way that makes it easily convertible, and I don't see the
folks at Lucene changing their coding style to do that.

An example: 3.0.3 changes classes that inherited from util.Parameter to
Java enums.  Java enums are much closer to classes than C# enums are: they
can have methods, fields, etc.  I wound up converting them into C# enums
with extension methods and/or static classes (usually to generate the
enum).  The way the code was written in Java, there's no way an automated
tool could figure that out on its own, unless you had some way to tell it
what to do beforehand.
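
A minimal sketch of the kind of Java enum described above may help show
why there is no direct C# counterpart.  The Occurrence name, its constants
and the symbol field are assumptions made up for illustration; they are not
copied from Lucene's source.

```java
// Illustrative only: names and values are assumptions, not Lucene's actual source.
enum Occurrence {
    MUST("+"),
    SHOULD(""),
    MUST_NOT("-");

    // A Java enum constant is a full object, so it can carry per-constant state.
    private final String symbol;

    Occurrence(String symbol) {
        this.symbol = symbol;
    }

    // It can also carry behaviour. A C# enum is only a named integral value,
    // so a port has to move this into an extension method or a static helper.
    String symbol() {
        return symbol;
    }

    boolean isProhibited() {
        return this == MUST_NOT;
    }
}

final class OccurrenceDemo {
    public static void main(String[] args) {
        for (Occurrence o : Occurrence.values()) {
            System.out.println(o + " symbol='" + o.symbol() + "' prohibited=" + o.isProhibited());
        }
    }
}
```

On the C# side the constants would become a plain enum, and symbol() and
isProhibited() would move into extension methods or a static class, which
is a restructuring decision rather than a syntax translation.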

I imagine porting it by hand is probably easier, though it would be nice if
there was a tool that would at least convert the syntax from Java to C#, as
well as changing the naming scheme to a .NET compatible one.  However, that
only really helps if you're porting classes from scratch.  It could, also,
hide bugs, since it's possible, however unlikely, something could port
perfectly, but not behave the same way.

A class that has many calls to String.substring is a good example of this.
 If the name of the function is changed to the .NET version (.substring to
.Substring), it would compile with no problems, but the two are very
different.  C#'s signature is Substring(int startIndex, int length) while
Java's is substring(int beginIndex, int endIndex).  The renamed call may
work while hiding issues, or it may throw an exception, depending on the
data.  A porting tool would probably know many of the differences like
this, so it's sort of a moot point, in that this relies on the skills of
the developer anyway.
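
The substring mismatch can be sketched in Java alone.  The SubstringPort
class and its method names below are hypothetical; csharpStyleSubstring
only emulates the (startIndex, length) meaning that C#'s Substring gives to
its two arguments, so that both interpretations can be run side by side.

```java
// Sketch: the same two integer arguments mean different things in each API.
final class SubstringPort {
    // Java: String.substring(beginIndex, endIndex), where endIndex is exclusive.
    static String javaSubstring(String s, int beginIndex, int endIndex) {
        return s.substring(beginIndex, endIndex);
    }

    // Emulates C#'s Substring(startIndex, length) on top of Java's API, to
    // show what a blindly renamed call would compute from the same arguments.
    static String csharpStyleSubstring(String s, int startIndex, int length) {
        return s.substring(startIndex, startIndex + length);
    }

    public static void main(String[] args) {
        String text = "lucene.net";
        System.out.println(javaSubstring(text, 2, 6));        // cene
        System.out.println(csharpStyleSubstring(text, 2, 6)); // cene.n
    }
}
```

When the first argument is 0 both readings return the same string, which
is how such a bug can stay hidden; with a large second argument the
length-based reading runs past the end of the string and throws.  A correct
port passes endIndex - beginIndex as the length.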

I may be wrong, but I just don't see this being a fully automated process
ever.  I would love to have something automated that at least fixed syntax
errors, though this would only work on a line-by-line port.  (Slightly off
topic, I think we should always have a line-by-line port, even if our
primary goals become focusing on a fully .Net style port)  Either way, any
sort of manual or partly-automated process would still require a lot of
work to make sure things are ported correctly.  I also think it's most
manageable if it were a tool that did it on a file per file basis (instead
of project level like Sharpen), for easy review and testing.


Thanks,
Christopher

On Mon, Nov 21, 2011 at 3:30 PM, Scott Lombard lombardena...@gmail.com wrote:

 Chris,

 Now that you have spent some time dealing with the porting, what is your
 view on creating a fully automated porting tool?

 Scott

  -Original Message-
  From: Christopher Currens [mailto:currens.ch...@gmail.com]
  Sent: Monday, November 21, 2011 5:23 PM
  To: lucene-net-dev@lucene.apache.org
  Subject: Re: [Lucene.Net] Roadmap
 
  Digy,
 
  No worries.  I wasn't taking them personally.  You've been
  doing this for a lot longer than I have, but I didn't
  understand your pain until I had to go through it personally. :P

  Have you looked at Contrib in a while?  There are a lot of
  projects in Java's Contrib that are not in Lucene.Net.  Is
  this because some of them can't easily (if at all) be ported
  over to .NET, or just because they've been neglected?  I'm
  trying to get a handle on what's important to port and what
  isn't.  Figured someone with experience could help me decide
  where to start with everything that's missing.
 
 
  Thanks,
  Christopher
 
  On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote:
 
  
   Chris,
  
   Sorry if you took my comments about the pain of porting personally.
   That wasn't my intention.

   +1 for all your changes/divergences. I made/could have made them too.
  
   DIGY
  
   -Original Message-
   From: Christopher Currens [mailto:currens.ch...@gmail.com]
   Sent: Monday, November 21, 2011 11:45 PM
   To: lucene-net-dev@lucene.apache.org
   Subject: Re: [Lucene.Net] Roadmap
  
   Digy,
  
   I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to
   the code in 2.9.4g as a reference for many things, particularly the
   Support classes.  We hit many of the same issues, I'm sure.  I moved
   some of the anonymous classes into a base class where you could inject
   functions, though not all could be replaced, nor did I replace all
   that could have been.  Some of our code is different: I went for the
   option for WeakDictionary to be completely generic, as in wrapping a
   generic dictionary with WeakKey<T> instead of wrapping the already
   existing WeakHashTable in Support.  In hindsight, it may have just
   been easier to convert the WeakHashTable to generic, but alas, I'm
   only realizing that now.  There is a problem with my WeakDictionary,
   specifically the function that determines when to clean/compact the
   dictionary and remove the dead keys.  I need a better heuristic for
   deciding when to run the clean.  That's

Re: [Lucene.Net] Roadmap

2011-11-21 Thread Christopher Currens
 pass.  There were some minor exceptions, the
ThaiAnalyzer and hyphenation analyzers that could not be ported,
ThaiAnalyzer because it relies on BreakIterator, and there's no built-in
functionality to split a string by words based on a culture in .NET, and no
third party library I could find that easily does it, and Hyphenation,
because it relies on SAX xml processing, which is also missing from .NET.

The FastVectorHighlighter project has also had all 3.0.3 changes ported to
it and to its tests, all of which pass.  All other projects in contrib
have yet to be touched/ported.

You can find some of my notes scattered about in // TODO comments, but most
centralized in the project directories:

src\core\FileDiffs.txt
src\core\ChangeNotes.txt
src\contrib\Analyzers\FileDiffs.txt
test\core\UpdatedTests.txt
test\contrib\analyzers\PortedTests.txt

If, and by if I mean when, you find porting errors, let me know and fix
them or have me fix them, or whatever you want to do.  The thing I worry
about the most are the tests for the collections I listed above, which I
will get around to writing soon.  I *have* found some porting issues in the
core dll that didn't manifest themselves in the Lucene.Net.Test test cases,
but did when I ported some of the tests for Contrib.Analyzers.  I have a
feeling they will be found slowly and surely, but I feel that they are few
and far between.

If anyone wants to help on this branch, I'd welcome it, we would just need
to coordinate who is working on what, so we aren't porting the same thing
and wasting time.

Thanks,
Christopher

TL;DR: Lucene.Net/Lucene.Net.Tests have all been ported to 3.0.3 (with a
few very minor exceptions), Contrib.Analyzers/Contrib.Analyzer.Test have
all been ported to 3.0.3 (few minor exceptions),
FastVectorHighlighter/FastVectorHighlighter.Tests have all been ported to
3.0.3, and the rest of Contrib is going to be a pain.

On Sun, Nov 20, 2011 at 11:44 AM, Prescott Nasser geobmx...@hotmail.com wrote:


 Anyone have any thoughts on these items?

 My 2 cents is that after we get 2.9.4 out the door, we quickly release a
 2.9.4g (Digy - you're probably most familiar with 2.9.4g, is there any work
 that we should do to that to get it solid for a release?)

 I'm still unsure of the status of 3.0.3 or 4.0, but I'm thinking of the
 next release in Q1 2012.

  While you all take a look at the artifacts for a vote - I wanted to talk
  about the future roadmap and our releases -

  2.9.4g is very stable - do we want to release this at some point?

  3.0.3 - Chris looks to be pretty active on this. Chris, can you fill us
  in on what's the status of this branch?

  4.0 - looks to be partially underway.

  I want to try and maybe build a better release schedule and begin
  filling out what needs to be done so people can easily jump in and help
  out. I noticed the 4.0 status page in the wiki - that's excellent

  ~P



Re: [Lucene.Net] Roadmap

2011-11-21 Thread casper...@caspershouse.com
+1 on the suggestion to move Close -> IDisposable; not being able to use
"using" is such a pain, and an eyesore in the code.

Although it will have to be done properly, and not just have Dispose call
Close (you should have proper protected virtual Dispose methods to take
inheritance into account, etc.).


- Nick



From: Christopher Currens currens.ch...@gmail.com
Sent: Monday, November 21, 2011 2:56 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

Regarding the 3.0.3 branch I started last week, I've put in a lot of late
nights and gotten far more done in a week and a half than I expected.  The
list of changes is very large, and fortunately, I've documented it in some
files that are in the branch's root of certain projects.  I'll list what
changes have been made so far, and some of the concerns I have about them,
as well as what still needs to be done.  You can read them all in detail in
the files that are in the branch.


All changes in 3.0.3 have been ported to Lucene.Net and
Lucene.Net.Test, except BooleanClause, LockStressTest, MMapDirectory,
NIOFSDirectory, DummyConcurrentLock, NamedThreadFactory, and
ThreadInterruptedException.


MMapDirectory and NIOFSDirectory were never ported in the first place
for 2.9.4, so I'm not worried about those.  LockStressTest is a
command-line tool; porting it should be easy, but is not essential to a
3.0.3 release, IMO.  DummyConcurrentLock also seems unnecessary (and
non-portable) for .NET, since it's based around Java's Lock class and is
only used to bypass locking, which can be done by passing new Object() to
the method.

NamedThreadFactory I'm unsure about.  It's used in ParallelMultiSearcher
(in which I've opted to use the TPL), and seems to be only used for
debugging, possibly testing.  Either way, I'm not sure it's necessary.
Also, named threads would mean we probably would have to move the class
away from the TPL, which greatly simplified the code and parallelization
of it all, as I can't see a way to set names for a Task.  I suppose it
might be possible, as Tasks have unique Ids, and you could use a Dictionary
to map the thread's name to the Id in the factory, but you'd have to create
a helper function that would allow you to find a task by its name, which
seems more work than the resulting benefits.  VS2010 already has better
support for debugging tasks than threads (I used it when writing the
class); frankly, it's amazing how easy it was to debug.
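
For context, a thread factory that exists only to name threads can be
sketched in a few lines of Java.  PrefixedThreadFactory is a hypothetical
stand-in written for this illustration and is not Lucene's
NamedThreadFactory source; it only shows why such a class adds debugging
convenience rather than behaviour.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a thread factory that only adds readable names;
// this is not Lucene's NamedThreadFactory source.
final class PrefixedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    PrefixedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // The only thing added over a default factory is the thread name.
        return new Thread(task, prefix + "-thread-" + counter.getAndIncrement());
    }
}

final class PrefixedThreadFactoryDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadFactory factory = new PrefixedThreadFactory("searcher");
        Thread t = factory.newThread(
                () -> System.out.println(Thread.currentThread().getName()));
        t.start();
        t.join();
    }
}
```

If the real class does no more than this, dropping it in a TPL-based port
loses only the names that show up in a debugger or thread dump.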


Other than the above, the entire code base in the core dlls is at 3.0.3,
which is exciting, as I'm really hoping we can get Lucene.Net up to the
current version of Java's 3.x branch, and start working on a line-by-line
port of 4.0.  Tests need to be written for some of the collections I've
made that emulate Java's, to make sure they're even behaving the same way.
The good news is that all of the existing tests pass as a whole, so it
seems to be working, though I'd like the peace of mind of having tests for
them (being HashMap<TKey, TValue>, WeakDictionary<TKey, TValue> and
IdentityCollection<TKey, TValue>, it's quite possible any one of them could
be completely wrong in how they were put together.)


I'd also like to finally formalize the way we use IDisposable in
Lucene.Net, by marking the Close functions as obsolete, moving the code
into Dispose, and eventually (or immediately) removing the Close functions.
There's so much change to the API that now would be a good time to make
that change if we wanted to.  I'm hesitant to move away from a line-by-line
port of Lucene.Net completely; I'd rather keep it as close as possible.
The main reason I feel this way is that when I was porting the Shingle
namespace of Contrib.Analyzers, Troy had written it in a .NET way which
differed GREATLY from Java Lucene, and it did make porting it considerably
more difficult; to keep the language to a minimum, I'm just going to say it
was a pain, a huge pain in fact.  I love the idea of moving to a more .NET
design, but I'd like to maintain a line-by-line port anyway, as I think
porting changes is far easier and quicker that way.  At this point, I'm
more interested in getting Lucene.Net to 4.0 and caught up to Java than I
am in anything else, hence the extra amount of time I've put into this
project over the past week and a half.  Though this isn't really the place
for this discussion.


The larger area of difficulty for the port, however, is the Contrib section.
There are two major problems with it that are slowing me down.  First,
there are a lot of classes that are outdated.  I've found versions of code
that still have the Apache 1.1 License attached to them, which makes the
code quite old.  Also, it was almost impossible for me to port a lot of
changes in Contrib.Analyzers, since the code was so old and different from
Java's 2.9.4.

Second, we had almost no unit tests ported for any of the classes, which
means

Re: [Lucene.Net] Roadmap

2011-11-21 Thread Christopher Currens
Some of the Lucene classes have Dispose methods, well, ones that call Close
(and that Close method may or may not call base.Close(), if needed or not).
 Virtual dispose methods can be dangerous only in that they're easy to
implement wrong.  However, it shouldn't be too bad, at least with a
line-by-line port, as we would make the call to the base class whenever
Lucene does, and that would (should) give us the same behavior, implemented
properly.  I'm not aware of differences in the JVM, regarding inheritance
and base methods being called automatically, particularly Close methods.
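
As a small Java sketch of the point about base methods: an overridden
close() in a subclass is never chained to the base class automatically, so
the call to super.close() has to be written out, just as a C# override
would have to call its base explicitly.  BaseResource and DerivedResource
are hypothetical classes used only for illustration.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes; they only illustrate explicit close() chaining.
class BaseResource implements Closeable {
    protected final List<String> log;

    BaseResource(List<String> log) {
        this.log = log;
    }

    @Override
    public void close() {
        log.add("base closed");
    }
}

class DerivedResource extends BaseResource {
    DerivedResource(List<String> log) {
        super(log);
    }

    @Override
    public void close() {
        log.add("derived closed");
        // Java does not call the overridden base method on its own;
        // without this line the base class would never release anything.
        super.close();
    }
}

final class CloseChainDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        new DerivedResource(log).close();
        System.out.println(log);
    }
}
```

A port that places the base call wherever the Java code has super.close()
should therefore see the same order of cleanup.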

Slightly unrelated, another annoyance is the use of Java Iterators vs C#
Enumerables.  A lot of our code is there simply because there are
Iterators, but it could be converted to Enumerables. The whole HasNext,
Next vs C#'s MoveNext(), Current is annoying, but it's used all over in the
base code, and would have to be changed there as well.  Either way, I would
like to push for that before 3.0.3 is released.  IMO, small changes like
this still keep the code similar to the line-by-line port, in that they
don't add any difficulties in the porting process, but provide great
benefits to the users of the code, who get a .NET-centric API.  I don't
think it would violate the project description we have listed on our
Incubator page, either.
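
The difference between the two iteration protocols can be sketched with a
small adapter, written here in Java.  EnumeratorAdapter is a hypothetical
class, not something from either code base; it exposes a Java Iterator
(hasNext/next) through the MoveNext/Current shape that C# enumerators use.
Throwing from current() when no element is selected is a choice made for
this sketch.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical adapter: wraps a Java Iterator in the two-member protocol
// used by C# enumerators (MoveNext + Current).
final class EnumeratorAdapter<T> {
    private final Iterator<T> source;
    private T current;
    private boolean positioned = false;

    EnumeratorAdapter(Iterator<T> source) {
        this.source = source;
    }

    // Advances to the next element; returns false once the source is exhausted.
    boolean moveNext() {
        if (source.hasNext()) {
            current = source.next();
            positioned = true;
            return true;
        }
        positioned = false;
        return false;
    }

    // Returns the element selected by the last successful moveNext().
    T current() {
        if (!positioned) {
            throw new NoSuchElementException("moveNext() has not selected an element");
        }
        return current;
    }
}

final class EnumeratorAdapterDemo {
    public static void main(String[] args) {
        EnumeratorAdapter<String> e =
                new EnumeratorAdapter<>(Arrays.asList("a", "b", "c").iterator());
        while (e.moveNext()) {
            System.out.println(e.current());
        }
    }
}
```

Java's next() both advances and returns, while the C# shape separates
those steps, which is why code built around one protocol has to be
reshaped rather than renamed.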


Thanks,
Christopher


Re: [Lucene.Net] Roadmap

2011-11-21 Thread casper...@caspershouse.com


Christopher,

I'd say they're not that hard to get right; the pattern for correctly
implementing the IDisposable interface is well-established and has been
common practice since .NET 1.0:

http://msdn.microsoft.com/en-us/library/b1yfkh5e(v=VS.100).aspx

Additionally, I said protected virtual (as per the recommendation in the
link above).

Also agreed on the use of iterators everywhere.  Foreach is your friend.

What would be even better in some cases is using yield return, as I'm sure
result sets don't need to be materialized everywhere as they are now.


- Nick




RE: [Lucene.Net] Roadmap

2011-11-21 Thread Digy
Hi Chris,

First of all, thank you for your great work on the 3.0.3 branch.
I suppose you took 2.9.4 as the code base for the 3.0.3 port, since some of
your problems are the same as those I faced in the 2.9.4g branch, e.g.:
  Support/MemoryMappedDirectory.cs (but never used in core),
  IDisposable,
  introduction of some Action<T>s and Func<T>s,
  foreach instead of GetEnumerator/MoveNext,
  IEquatable<T>,
  WeakDictionary<T>,
  Set<T>,
  etc.

Since I also used 3.0.3 as a reference, maybe we can use some of 2.9.4g's
code in 3.0.3 when necessary (I haven't had time to look into 3.0.3 deeply).

Just to ensure coordination, maybe you should create a new issue in
JIRA, so that people send patches to that issue instead of committing
directly.


@Prescott,
2.9.4g is not behind 2.9.4 at the bug fix & feature level. So it is (I
think) ready for another release. (I have used it in all my projects for a
long time.)


PS: Hearing about the pain of porting code that greatly differs from Java
just made me smile (sorry for that :( ). Be ready for responses that go
beyond criticism, between "With all due respect" & "Just my $0.02"
parentheses.

DIGY


Re: [Lucene.Net] Roadmap

2011-11-21 Thread Troy Howard
So, if we're getting back to the line by line port discussion... I
think either side of this discussion is too extreme. For the case in
point Chris just mentioned (which I'm not really sure what part was so
difficult, as I ported that library in about 30 minutes from
scratch)... anything is a pain if it sticks out in the middle of doing
something completely different.

The only reason we are able to do this line by line is due to the
general similarity between Java and C#'s language syntax. If we were
porting Lucene to a completely different language, that had a totally
different syntax, the process would go like this:

- Look at the original code, understand its intent
- Create similar code in the new language that expresses the same intent

When applying changes:

- Look at the original code diffs, understanding the intent of the change
- Look at the ported code, and apply the changed logic's meaning in
that language

So, it's just a different thought process. In my opinion, it's a better
process because it forces the developer to actually think about the
code instead of blindly converting syntax (possibly slightly
incorrectly and introducing regressions). While there is a large
volume of unit tests in Lucene, they are unfortunately not really the
right tests and make porting much more difficult, because it's hard to
verify that your ported code behaves the same because you can't just
rely on the unit tests to verify your port. Therefore, it's safer to
follow a process that requires the developer to delve deeply into the
meaning of the code. Following a line-by-line process is convenient,
but doesn't focus on meaning, which I think is more important.

Thanks,
Troy

On Mon, Nov 21, 2011 at 2:23 PM, Christopher Currens
currens.ch...@gmail.com wrote:
 Digy,

 No worries.  I wasn't taking them personally.  You've been doing this for a
 lot longer than I have, but I didn't understand your pain until I had to go
 through it personally. :P

 Have you looked at Contrib in a while?  There are a lot of projects in
 Java's Contrib that are not in Lucene.Net.  Is this because some of them
 can't easily (if at all) be ported over to .NET, or just because they've
 been neglected?  I'm trying to get a handle on what's important to port
 and what isn't.  Figured someone with experience could help me decide
 where to start with everything that's missing.


 Thanks,
 Christopher

 On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote:


 Chris,

 Sorry if you took my comments about the pain of porting personally. That
 wasn't my intention.

 +1 for all your changes/divergences. I made/could have made them too.

 DIGY

 -Original Message-
 From: Christopher Currens [mailto:currens.ch...@gmail.com]
 Sent: Monday, November 21, 2011 11:45 PM
 To: lucene-net-...@lucene.apache.org
 Subject: Re: [Lucene.Net] Roadmap

 Digy,

 I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the
 code in 2.9.4g as a reference for many things, particularly the Support
 classes.  We hit many of the same issues I'm sure, I moved some of the
 anonymous classes into a base class where you could inject functions,
 though not all could be replaced, nor did I replace all that could have
 been.  Some of our code is different, I went for the option for
 WeakDictionary to be completely generic, as in wrapping a generic
 dictionary with WeakKey<T> instead of wrapping the already existing
 WeakHashTable in support.  In hindsight, it may have just been easier to
 convert the WeakHashTable to generic, but alas, I'm only realizing that
 now.  There is a problem with my WeakDictionary, specifically the function
 that determines when to clean/compact the dictionary and remove the dead
 keys.  I need a better heuristic of deciding when to run the clean.  That's
 a performance issue though.
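
One possible heuristic for the clean step, sketched in Python rather than C#: compact only when the table has grown to about twice the number of entries that survived the previous compaction. The class name `WeakKeyTable`, the `min_threshold` parameter, and the doubling rule are all assumptions made for illustration; this is not the WeakDictionary code from the 3.0.3 branch.

```python
import weakref


class WeakKeyTable:
    """Mapping whose keys are held only through weak references.

    Hypothetical sketch; not the Lucene.Net WeakDictionary implementation.
    """

    def __init__(self, min_threshold=8):
        self._items = {}                  # weakref.ref(key) -> value
        self._min_threshold = min_threshold
        self._threshold = min_threshold   # size that triggers the next compaction

    def __setitem__(self, key, value):
        self._items[weakref.ref(key)] = value
        if len(self._items) >= self._threshold:
            self.compact()

    def __getitem__(self, key):
        return self._items[weakref.ref(key)]

    def __len__(self):
        # Counts dead entries too, until the next compaction removes them.
        return len(self._items)

    def compact(self):
        """Drop entries whose key was collected, then reset the trigger."""
        self._items = {ref: v for ref, v in self._items.items() if ref() is not None}
        # Wait until the table doubles relative to the surviving entries, so
        # the full scan is spread over at least that many new inserts.
        self._threshold = max(self._min_threshold, 2 * len(self._items))
```

Because each scan is followed by at least as many new-key inserts as there were surviving entries, the cost of cleaning stays roughly constant per insert.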

 Regarding the pain of porting, I am a changed man.  It's nice, in a sad
 way, to know that I'm not the only one who experienced those difficulties.
  I used to be in the camp that porting code that differed from java
 wouldn't be difficult at all.  However, now I stand corrected!  It threw me
 a curve-ball, for sure.  I DO think a line-by-line port can definitely
 include the things talked about below, ie the changes to Dispose and the
 changes to IEnumerable<T>.  Those changes, I think, can be made without a
 heavy impact on the porting process.

 There was one fairly large change I opted to use that differed quite a bit
 from Java, however, and that was the use of the TPL in
 ParallelMultiSearcher.  It was far easier to port this way, and I don't
 think it affects the porting process too much.  Java uses a helper class
 defined at the bottom of the source file that handles it, I'm simply using
 a built-in one instead.  I just need to be careful about it, it would be
 really easy to get carried away with it.


 Thanks,
 Christopher

 On Mon, Nov 21, 2011 at 1:20 PM, Digy digyd...@gmail.com wrote:

  Hi Chris

RE: [Lucene.Net] Roadmap

2011-11-21 Thread Digy
Troy,
I am not against it if you can continue to understand and port so easily.
No one here, I think, wants Java-flavored code.

DIGY

-Original Message-
From: Troy Howard [mailto:thowar...@gmail.com] 
Sent: Tuesday, November 22, 2011 12:42 AM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

So, if we're getting back to the line by line port discussion... I
think either side of this discussion is too extreme. For the case in
point Chris just mentioned (which I'm not really sure what part was so
difficult, as I ported that library in about 30 minutes from
scratch)... anything is a pain if it sticks out in the middle of doing
something completely different.

The only reason we are able to do this line by line is due to the
general similarity between Java and C#'s language syntax. If we were
porting Lucene to a completely different language, that had a totally
different syntax, the process would go like this:

- Look at the original code, understand its intent
- Create similar code in the new language that expresses the same intent

When applying changes:

- Look at the original code diffs, understanding the intent of the change
- Look at the ported code, and apply the changed logic's meaning in
that language

So, it's just a different thought process. In my opinion, it's a better
process because it forces the developer to actually think about the
code instead of blindly converting syntax (possibly slightly
incorrectly, introducing regressions). While there is a large
volume of unit tests in Lucene, they are unfortunately not really the
right tests, and that makes porting much more difficult: you can't just
rely on the unit tests to verify that your ported code behaves the same
as the original. Therefore, it's safer to
follow a process that requires the developer to delve deeply into the
meaning of the code. Following a line-by-line process is convenient,
but doesn't focus on meaning, which I think is more important.

Thanks,
Troy

On Mon, Nov 21, 2011 at 2:23 PM, Christopher Currens
currens.ch...@gmail.com wrote:
 Digy,

 No worries.  I wasn't taking them personally.  You've been doing this for a
 lot longer than I have, but I didn't understand your pain until I had to go
 through it personally. :P

 Have you looked at Contrib in a while?  There's a lot of projects that are
 in Java's Contrib that are not in Lucene.Net?  Is this because there are
 some that can't easily (if at all) be ported over to .NET or just because
 they've been neglected?  I'm trying to get a handle on what's important to
 port and what isn't.  Figured someone with experience could help me with a
 starting point over deciding where to start with everything that's missing.


 Thanks,
 Christopher

 On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote:


 Chris,

 Sorry, if you took my comments about pain of porting personally. That
 wasn't my intention.

 +1 for all your changes/divergences. I made/could have made them too.

 DIGY

 -Original Message-
 From: Christopher Currens [mailto:currens.ch...@gmail.com]
 Sent: Monday, November 21, 2011 11:45 PM
 To: lucene-net-...@lucene.apache.org
 Subject: Re: [Lucene.Net] Roadmap

 Digy,

 I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the
 code in 2.9.4g as a reference for many things, particularly the Support
 classes.  We hit many of the same issues I'm sure, I moved some of the
 anonymous classes into a base class where you could inject functions,
 though not all could be replaced, nor did I replace all that could have
 been.  Some of our code is different, I went for the option for
 WeakDictionary to be completely generic, as in wrapping a generic
 dictionary with WeakKey<T> instead of wrapping the already existing
 WeakHashTable in support.  In hindsight, it may have just been easier to
 convert the WeakHashTable to generic, but alas, I'm only realizing that
 now.  There is a problem with my WeakDictionary, specifically the function
 that determines when to clean/compact the dictionary and remove the dead
 keys.  I need a better heuristic of deciding when to run the clean.  That's
 a performance issue though.

 Regarding the pain of porting, I am a changed man.  It's nice, in a sad
 way, to know that I'm not the only one who experienced those difficulties.
  I used to be in the camp that porting code that differed from java
 wouldn't be difficult at all.  However, now I stand corrected!  It threw me
 a curve-ball, for sure.  I DO think a line-by-line port can definitely
 include the things talked about below, ie the changes to Dispose and the
 changes to IEnumerable<T>.  Those changes, I think, can be made without a
 heavy impact on the porting process.

 There was one fairly large change I opted to use that differed quite a bit
 from Java, however, and that was the use of the TPL in
 ParallelMultiSearcher.  It was far easier to port this way, and I don't
 think it affects the porting process too

Re: [Lucene.Net] Roadmap

2011-11-21 Thread Christopher Currens
To clarify, it wasn't so much *difficult* as it was *painful*.  Above,
I was implying that it was more difficult than the rest of the code, which
by comparison was easier.  It wasn't painless to try and map where code
changes were from the java classes into the .Net version.  I prefer that
style more for its readability and the niceties of working with a .Net
style of Lucene, however as I said before, it slowed down the
porting process significantly.  I hope it didn't come across that I thought
that it was bad code, because it's probably the most readable code we have
in the Contrib at the moment.

I want to make it clear that my intention right now is to get Lucene.Net up
to date with Java.  When I read the Java code, I understand its intent, and
I make sure the ported code represents it.  That takes enough time as it
is.  Having to figure out where the code went in Lucene.Net, since
it wasn't a 1-1 map, was a MINOR annoyance, especially when you compare it
to the issues I had dealing with the differences between the two languages,
generics especially.  That being said, I don't have a problem with code
being converted in a .Net idiomatic way, in fact, I welcome it, if it still
allows the changes to be ported with minimal effort.  I feel at this point
in the project, there are some limitations to how far I'd like it to
diverge.

Anyway, my opinion, which may not be in agreement with the group as a
whole, is that it would be better to bring the codebase up to date, or at
least more up to date with java's, and then maintain a version with a
complete .NET-centric API.  I feel this would be easier, as porting
Java's Lucene SVN commits by the week would be a relatively small workload.

On Mon, Nov 21, 2011 at 2:41 PM, Troy Howard thowar...@gmail.com wrote:

 So, if we're getting back to the line by line port discussion... I
 think either side of this discussion is too extreme. For the case in
 point Chris just mentioned (which I'm not really sure what part was so
 difficult, as I ported that library in about 30 minutes from
 scratch)... anything is a pain if it sticks out in the middle of doing
 something completely different.

 The only reason we are able to do this line by line is due to the
 general similarity between Java and C#'s language syntax. If we were
 porting Lucene to a completely different language, that had a totally
 different syntax, the process would go like this:

 - Look at the original code, understand its intent
 - Create similar code in the new language that expresses the same intent

 When applying changes:

 - Look at the original code diffs, understanding the intent of the change
 - Look at the ported code, and apply the changed logic's meaning in
 that language

 So, it's just a different thought process. In my opinion, it's a better
 process because it forces the developer to actually think about the
 code instead of blindly converting syntax (possibly slightly
 incorrectly, introducing regressions). While there is a large
 volume of unit tests in Lucene, they are unfortunately not really the
 right tests, and that makes porting much more difficult: you can't just
 rely on the unit tests to verify that your ported code behaves the same
 as the original. Therefore, it's safer to
 follow a process that requires the developer to delve deeply into the
 meaning of the code. Following a line-by-line process is convenient,
 but doesn't focus on meaning, which I think is more important.

 Thanks,
 Troy

 On Mon, Nov 21, 2011 at 2:23 PM, Christopher Currens
 currens.ch...@gmail.com wrote:
  Digy,
 
  No worries.  I wasn't taking them personally.  You've been doing this
 for a
  lot longer than I have, but I didn't understand your pain until I had to
 go
  through it personally. :P
 
  Have you looked at Contrib in a while?  There's a lot of projects that
 are
  in Java's Contrib that are not in Lucene.Net?  Is this because there are
  some that can't easily (if at all) be ported over to .NET or just because
  they've been neglected?  I'm trying to get a handle on what's important
 to
  port and what isn't.  Figured someone with experience could help me with
 a
  starting point over deciding where to start with everything that's
 missing.
 
 
  Thanks,
  Christopher
 
  On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote:
 
 
  Chris,
 
  Sorry, if you took my comments about pain of porting personally. That
  wasn't my intention.
 
  +1 for all your changes/divergences. I made/could have made them too.
 
  DIGY
 
  -Original Message-
  From: Christopher Currens [mailto:currens.ch...@gmail.com]
  Sent: Monday, November 21, 2011 11:45 PM
  To: lucene-net-...@lucene.apache.org
  Subject: Re: [Lucene.Net] Roadmap
 
  Digy,
 
  I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the
  code in 2.9.4g as a reference for many things, particularly the Support
  classes.  We hit many of the same issues I'm sure, I moved some

Re: Pruning down roadmap for 3.2

2011-05-22 Thread Shai Erera

 my point is just that when bulk editing these things,
 we should just un-set instead of rolling forward


+1. Before 3.1 I reviewed the list of issues that were not updated since Jan
1st 2009, and closed most of them (those that looked like they're either
solved or not of interest anymore). Some issues that were updated post that
date were not commented on, or had a patch update or something, but their
update date attribute was affected by version changes and the like. So if we
unset the version flag going forward, it will be clearer (and easier) to
determine which issues are really old and can be closed.

: personally, when i set these versions on issues, i set versions
 : 3.x+4.0 usually to indicate that I think it's applicable to both the
 : stable branch and trunk (versus 4.0-only indicating it's not the kind
 : of thing that should be backported).

 that makes sense too -- personally I think we should be stricter about it
 ... we should only mark something as fix for 3.3 if we think (at the time
 we are reviewing/updating the issue) that we really shouldn't release 3.3
 w/o this fix/feature.


Actually both approaches make sense to me :). I personally set the version
flag of every issue I report, but I will try to avoid that, and set the flag
for issues I'm sure I'll have time to work on for the next release, or are
important enough for the next release. Otherwise, important issues will fall
between the cracks and be forgotten (until one day my giant old-issues-sweeping
broom will get rid of them :)).

Shai

On Fri, May 20, 2011 at 10:27 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : personally, when i set these versions on issues, i set versions
 : 3.x+4.0 usually to indicate that I think it's applicable to both the
 : stable branch and trunk (versus 4.0-only indicating it's not the kind
 : of thing that should be backported).

 that makes sense too -- personally I think we should be stricter about it
 ... we should only mark something as fix for 3.3 if we think (at the time
 we are reviewing/updating the issue) that we really shouldn't release 3.3
 w/o this fix/feature.

 obviously things change ... one day we think something should definitely
 be in a release, two weeks later we might decide that it's too much work,
 or too complex to try and shoehorn in, and we consciously change the
 version.

 I'm not suggesting that anyone change their current practice of how/when
 they choose what version to mark on a jira issue when they are reviewing a
 particular issue -- my point is just that when bulk editing these things,
 we should just un-set instead of rolling forward:  If some issues have
 fallen by the way-side to the point that we are bulk editing them out of
 blocking the current release, that's a pretty good indicator that they
 aren't a priority for the next release unless someone steps forward and
 deliberately says they should be.


 -Hoss

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Pruning down roadmap for 3.2

2011-05-20 Thread Chris Hostetter


If we're serious about wanting to do a 3.2 release sometime in June, the 
first step is to get proactive about pruning down the roadmap of issues
that aren't going to make the cut.


Here's a list of every Unresolved fix=3.2 issue that has no assignee and 
no attachments (that hasn't been updated in the last two days just in case
something is super new and actively being worked on)...


https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=%28project+%3D+SOLR+OR+project+%3D+LUCENE%29+AND+NOT+updated+%3E%3D+2011-05-18+AND+resolution+%3D+Unresolved+AND+assignee+is+EMPTY+AND+fixVersion+%3D+%223.2%22+AND+%22Attachment+count%22+%3D+%220%22+ORDER+BY+issuetype+ASC%2C+updated+ASC%2C+key+DESC

...current count is 103.
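
For anyone who wants to read the filter without opening JIRA, the search link can be decoded into plain JQL. This is a small sketch using only the Python standard library; it assumes `reset` and `jqlQuery` are separate query parameters joined by `&`, and the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# The search link from the message, assuming "&" separates the two parameters.
url = (
    "https://issues.apache.org/jira/secure/IssueNavigator.jspa"
    "?reset=true&jqlQuery=%28project+%3D+SOLR+OR+project+%3D+LUCENE%29"
    "+AND+NOT+updated+%3E%3D+2011-05-18+AND+resolution+%3D+Unresolved"
    "+AND+assignee+is+EMPTY+AND+fixVersion+%3D+%223.2%22"
    "+AND+%22Attachment+count%22+%3D+%220%22"
    "+ORDER+BY+issuetype+ASC%2C+updated+ASC%2C+key+DESC"
)

# parse_qs percent-decodes each value and turns "+" back into spaces.
params = parse_qs(urlsplit(url).query)
jql = params["jqlQuery"][0]
print(jql)
```

The decoded query selects unresolved SOLR and LUCENE issues with fixVersion "3.2", no assignee, no attachments, and no update since 2011-05-18.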

I suggest that we un-mark these for 3.2 this weekend.  If you really feel 
strongly that something on this list needs to make it into 3.2, assign it 
to yourself.


The next step (in a week or so?) should probably be to audit the list of 
everything that doesn't have an assignee (even if it does have a patch) 
and un-mark those for 3.2.  if no one is willing to step up and say that 
they will massage/commit the patch, then either the patch isn't ready, or 
we aren't ready to support it in a release.


Objections?

(Note: In two days I'm leaving for Hong Kong and will be offline for 2.5
weeks, so I won't be taking the initiative to make these bulk edits myself;
I'm just trying to help get the ball rolling)





-Hoss




Re: Pruning down roadmap for 3.2

2011-05-20 Thread Robert Muir
On Fri, May 20, 2011 at 2:59 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 If we're serious about wanting to do a 3.2 release sometime in June, the
 first step is to get proactive about pruning down the roadmap of issues that
 aren't going to make the cut.

 Here's a list of every Unresolved fix=3.2 issue that has no assignee and no
 attachments (that hasn't been updated in the last two days just in case
 something is super new and actively being worked on)...

 https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=%28project+%3D+SOLR+OR+project+%3D+LUCENE%29+AND+NOT+updated+%3E%3D+2011-05-18+AND+resolution+%3D+Unresolved+AND+assignee+is+EMPTY+AND+fixVersion+%3D+%223.2%22+AND+%22Attachment+count%22+%3D+%220%22+ORDER+BY+issuetype+ASC%2C+updated+ASC%2C+key+DESC

 ...current count is 103.

 I suggest that we un-mark these for 3.2 this weekend.  If you really feel
 strongly that something on this list needs to make it into 3.2, assign it to
 yourself.

+1, maybe we should make a 3.3 in jira and move those to it?


 The next step (in a week or so?) should probably be to audit the list of
 everything that doesn't have an assignee (even if it does have a patch) and
 un-mark those for 3.2.  if no one is willing to step up and say that they
 will massage/commit the patch, then either the patch isn't ready, or we
 aren't ready to support it in a release.


+1, sounds like another good filter to apply




Re: Pruning down roadmap for 3.2

2011-05-20 Thread Chris Hostetter

:  I suggest that we un-mark these for 3.2 this weekend.  If you really feel
:  strongly that something on this list needs to make it into 3.2, assign it to
:  yourself.
: 
: +1, maybe we should make a 3.3 in jira and move those to it?

That's what we've done in the past, but I think in general we should
just stop doing that -- It doesn't feel like it really improves the
situation.

Unless someone is actively working on an issue with a target of getting it
into a specific version, we should just leave the version blank.

Having all of these issues pending for a version makes it hard to really
see the realistic plan of what is likely to be in a given release. Some of
these issues have been open for a long time, with no activity, and no
indication that anyone thinks they are important (let alone actively
working on patches for them).  Hypothetically: if they hadn't all been
marked for 3.2, we might have done a release weeks ago if it had been
clear that there was only a handful of open issues people were actively
working on and wanted to get in 3.2 -- that would have likely motivated
us to rally around getting those issues finalized and making an RC.



-Hoss


Re: Pruning down roadmap for 3.2

2011-05-20 Thread Robert Muir
On Fri, May 20, 2011 at 3:14 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 :  I suggest that we un-mark these for 3.2 this weekend.  If you really feel
 :  strongly that something on this list needs to make it into 3.2, assign it 
 to
 :  yourself.
 :
 : +1, maybe we should make a 3.3 in jira and move those to it?

 That's what we've done in the past, but I think in general we should
 just stop doing that -- It doesn't feel like it really improves the
 situation.

 Unless someone is actively working on an issue with a target of getting it
 into a specific version, we should just leave the version blank.

 Having all of these issues pending for a version makes it hard to really
 see the realistic plan of what is likely to be in a given release. Some of
 these issues have been open for a long time, with no activity, and no
 indication that anyone thinks they are important (let alone actively
 working on patches for them).  Hypothetically: if they hadn't all been
 marked for 3.2, we might have done a release weeks ago if it had been
 clear that there was only a handful of open issues people were actively
 working on and wanted to get in 3.2 -- that would have likely motivated
 us to rally around getting those issues finalized and making an RC.


that's definitely fine by me, especially if you think perhaps it would
encourage us to get releases out faster.

personally, when i set these versions on issues, i set versions
3.x+4.0 usually to indicate that I think it's applicable to both the
stable branch and trunk (versus 4.0-only indicating it's not the kind
of thing that should be backported).

but we can definitely just leave the thing blank as maybe this isn't
that useful and makes releases seem even more hopeless.




Re: Pruning down roadmap for 3.2

2011-05-20 Thread Chris Hostetter
: personally, when i set these versions on issues, i set versions
: 3.x+4.0 usually to indicate that I think it's applicable to both the
: stable branch and trunk (versus 4.0-only indicating it's not the kind
: of thing that should be backported).

that makes sense too -- personally I think we should be stricter about it 
... we should only mark something as fix for 3.3 if we think (at the time 
we are reviewing/updating the issue) that we really shouldn't release 3.3 
w/o this fix/feature.

obviously things change ... one day we think something should definitely 
be in a release, two weeks later we might decide that it's too much work, 
or too complex to try and shoehorn in, and we consciously change the
version.

I'm not suggesting that anyone change their current practice of how/when 
they choose what version to mark on a jira issue when they are reviewing a 
particular issue -- my point is just that when bulk editing these things, 
we should just un-set instead of rolling forward:  If some issues have 
fallen by the way-side to the point that we are bulk editing them out of 
blocking the current release, that's a pretty good indicator that they 
aren't a priority for the next release unless someone steps forward and 
deliberately says they should be.


-Hoss




Proposed Roadmap

2011-02-18 Thread Troy Howard
All,

Following up on Scott's post asking about JIRA issues and our
development road map, I've put together a more detailed idea of how
we could divide work, schedule releases, and clean up the backlog.

There will be at least four main areas of work to address in the
upcoming months:
- Project Maintenance
- Catching up with the backlog
- Working on a new porting system
- The Future: New API and Lucene 3.X

Each one of those paths will need a separate road map and plan. In
JIRA these should probably be listed as separate components, along
with more structural components like Lucene.Net Core, Lucene.Net
Contrib, Luke.Net, etc...

Assuming we are working on these in parallel, I've included some
rough estimate dates for completion of each listed milestone in
the road maps.


Project Maintenance
===================

This includes the various aspects of transition from Lucene
subproject to Incubator Podling, as well as updating the website
and documentation.


Roadmap

* Website and branding Update (02/28/2011)
  - LUCENENET-379 - Clean up Lucene.Net website
    * NOTE: Should probably be split into a few separate issues:
      * Update website to be Apache CMS based
      * Update website to reflect current status and information
      * New site layout
      * New logo design

* Documentation Update (03/28/2011)
  - LUCENENET-382 - Create a wiki page for Lucene.Net
    * NOTE: This should probably have more detailed tasks defined
      and separately assigned / managed. This should focus on 2.9.2
      level code, examples, etc, plus FAQ, Design, and other
      similar documentation.


Catching up with the Backlog
============================

This includes finalizing the 2.9.2 release, updating that to
Lucene 2.9.4 compatibility and applying outstanding patches and
bug fixes. This will put us slightly out of sync with Java Lucene
because we'll have additional patches applied that the Java Lucene
project does not have for our 2.9.4 release.


Roadmap

* Lucene.Net 2.9.2 Binary release (02/28/2011)
  - LUCENENET-381 - Official release of Lucene.Net 2.9.2
    * Build from existing tag, no new changes

* Lucene.Net 2.9.4 Source/Binary release (03/28/2011)

  *EASY STUFF*
  - LUCENENET-389 - Signing the released assembly
  - LUCENENET-377 - Upgrade solution to VS2010
  - LUCENENET-361 - Workaround for a Mono C# compiler issue
  - LUCENENET-266 - Putting support classes in separate files and
    in a separate directory
  - LUCENENET-337 - TokenAttribute for Selectively Including Tokens
    in Length Norm
  - LUCENENET-330 - Search.Regex minimal port
  - LUCENENET-371 - Unit test for Search.Regex port
  - LUCENENET-374 - IndexReader.IsCurrent returning false
    positive in some cases
  - LUCENENET-179 - SnowballFilter speed improvment

  *HARDER STUFF*
  - LUCENENET-??? Rollup changes from Lucene 2.9.3/2.9.4 releases
  - LUCENENET-372 - NLS pack for Lucene.NET: BR, CJK, CN, CZ, DE,
    FR, NL, RU analyzers
    * NOTE: For v1.4 This code could be a starting point for a
      2.9.2 compatible version
  - LUCENENET-391 - Luke.Net for Lucene.Net
    * NOTE: For v1.4 This code could be a starting point for a
      2.9.2 compatible version
  - LUCENENET-172 - This patch fixes the unexceptional exceptions
    ecountered in FastCharStream and SupportClass
    * NOTE: Evaluate concerns expressed by George A. for this patch
  - LUCENENET-167 - Compact Framework  Silverlight Support
    * NOTE: Evaluate required steps and impact this will have on
      source code. Perhaps create a branch for CF/SilverLight.
  - LUCENENET-378 - Objects with a Close method should support
    IDisposable
    * NOTE: Significant diversion from Java, involves a lot of
      code-touch. Maybe take some ideas from and/or incorporate
      changes from various CodeProject forks?


Working on a New Porting System
===============================

We've discussed that we'd like to fully automate this process and
so far, the most obvious tool to use is Sharpen. This may involve
forking Sharpen (or contributing back to that project if
appropriate). We also discussed we'd like to set up a CI server
for this work (and other things).

* Evaluate tooling (02/28/2011)
  - LUCENENET-380 - Evaluate Sharpen as a port tool
    * NOTE: We should conclusively complete the evaluation and if
      we're ok with Sharpen, close this issue and move on to building
      a production version of the Sharpen code.

  - LUCENENET-??? - Evaluate CI systems and build a proposal for
    CI server setup

* Create Production System, 2.9.2 compatible (03/28/2011)
  - LUCENENET-??? - Create production version of automated port
    scripts for 2.9.2 build
    * NOTE: This will allow us to focus on the conversion process,
      not the Lucene.Net code changes. This will be considered
      complete when we are able to create a functionally equivalent
      2.9.2 port using Sharpen. This can be measured using existing
      unit tests, or by adding new ones as needed

RE: Proposed Roadmap

2011-02-18 Thread Prescott Nasser

Do you imagine us doing all the catching up on backlog by hand? And then 
later getting the automated conversion out?




 From: thowar...@gmail.com
 Date: Fri, 18 Feb 2011 01:36:00 -0800
 Subject: Proposed Roadmap
 To: lucene-net-dev@lucene.apache.org

 All,

 Following up on Scott's post asking about JIRA issues and our
 development road map, I've put together a more detailed idea of how
 we could divide work, schedule releases, and clean up the backlog.

 There will be at least four main areas of work to address in the
 upcoming months:
 - Project Maintenance
 - Catching up with the backlog
 - Working on a new porting system
 - The Future: New API and Lucene 3.X

 Each one of those paths will need a separate road map and plan. In
 JIRA these should probably be listed as separate components, along
 with more structural components like Lucene.Net Core, Lucene.Net
 Contrib, Luke.Net, etc...

 Assuming we are working on these in parallel, I've included some
 rough estimate dates for completion of each listed milestone in
 the road maps.


 Project Maintenance
 ==

 This includes the various aspects of transition from Lucene
 subproject to Incubator Podling, as well as updating the website
 and documentation.


 Roadmap

 * Website and branding Update (02/28/2011)
 - LUCENENET-379 - Clean up Lucene.Net website
 * NOTE: Should probably be split into a few separate issues:
 * Update website to be Apache CMS based
 * Update website to reflect current status and information
 * New site layout
 * New logo design

 * Documentation Update (03/28/2011)
 - LUCENENET-382 - Create a wiki page for Lucene.Net
 * NOTE: This should probably have more detailed tasks defined
 and separately assigned / managed. This should focus on 2.9.2
 level code, examples, etc, plus FAQ, Design, and other
 similar documentation.


 Catching up with the Backlog
 ==

 This includes finalizing the 2.9.2 release, updating that to
 Lucene 2.9.4 compatibility and applying outstanding patches and
 bug fixes. This will put us slightly out of sync with Java Lucene
 because we'll have additional patches applied that the Java Lucene
 project does not have for our 2.9.4 release.


 Roadmap

 * Lucene.Net 2.9.2 Binary release (02/28/2011)
 - LUCENENET-381 - Official release of Lucene.Net 2.9.2
 * Build from existing tag, no new changes

 * Lucene.Net 2.9.4 Source/Binary release (03/28/2011)

 *EASY STUFF*
 - LUCENENET-389 - Signing the released assembly
 - LUCENENET-377 - Upgrade solution to VS2010
 - LUCENENET-361 - Workaround for a Mono C# compiler issue
 - LUCENENET-266 - Putting support classes in separate files and
 in a separate directory
 - LUCENENET-337 - TokenAttribute for Selectively Including Tokens
 in Length Norm
 - LUCENENET-330 - Search.Regex minimal port
 - LUCENENET-371 - Unit test for Search.Regex port
 - LUCENENET-374 - IndexReader.IsCurrent returning false
 positive in some cases
 - LUCENENET-179 - SnowballFilter speed improvment

 *HARDER STUFF*
 - LUCENENET-??? Rollup changes from Lucene 2.9.3/2.9.4 releases
 - LUCENENET-372 - NLS pack for Lucene.NET: BR, CJK, CN, CZ, DE,
 FR, NL, RU analyzers
 * NOTE: For v1.4 This code could be a starting point for a
 2.9.2 compatible version
 - LUCENENET-391 - Luke.Net for Lucene.Net
 * NOTE: For v1.4 This code could be a starting point for a
 2.9.2 compatible version
 - LUCENENET-172 - This patch fixes the unexceptional exceptions
 ecountered in FastCharStream and SupportClass
 * NOTE: Evaluate concerns expressed by George A. for this patch
 - LUCENENET-167 - Compact Framework  Silverlight Support
 * NOTE: Evaluate required steps and impact this will have on
 source code. Perhaps create a branch for CF/SilverLight.
 - LUCENENET-378 - Objects with a Close method should support
 IDisposable
 * NOTE: Significant diversion from Java, involves a lot of
 code-touch. Maybe take some ideas from and/or incorporate
 changes from various CodeProject forks?


 Working on a New Porting System
 ==

 We've discussed that we'd like to fully automate this process and
 so far, the most obvious tool to use is Sharpen. This may involve
 forking Sharpen (or contributing back to that project if
 appropriate). We also discussed we'd like to set up a CI server
 for this work (and other things).

 * Evaluate tooling (02/28/2011)
 - LUCENENET-380 - Evaluate Sharpen as a port tool
 * NOTE: We should conclusively complete the evaluation and if
 we're ok with Sharpen, close this issue and move on to building
 a production version of the Sharpen code.

 - LUCENENET-??? - Evaluate CI systems and build a proposal for
 CI server setup

 * Create Production System, 2.9.2 compatible (03/28/2011)
 - LUCENENET-??? - Create production version of automated port
 scripts for 2.9.2 build
 * NOTE: This will allow us to focus on the conversion process,
 not the Lucene.Net code changes

RE: Proposed Roadmap

2011-02-18 Thread Lombard, Scott
I will take the lead on setting up the JIRA.  I will follow Troy's proposed 
roadmap and integrate changes as they come up.


Scott

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Friday, February 18, 2011 4:48 AM
To: lucene-net-...@lucene.apache.org
Cc: Prescott Nasser
Subject: Re: Proposed Roadmap

Yes. If you look at the milestone dates on those two tracks of development:

Lucene.Net 2.9.2 Binary release (02/28/2011)
Create Production System, 2.9.2 compatible (03/28/2011)
Lucene.Net 2.9.4 Source/Binary release (03/28/2011)
Create Production System, 2.9.4 compatible (04/25/2011)

The porting scripts are scheduled to lag behind the manual updates by
one month. It seemed that this would be the most efficient way to get
caught up quickly, while still moving forward with our long term
goals.
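
As a quick check on that one-month lag, here is a minimal Python sketch that computes the gap between each manual release milestone and its matching porting-system milestone. The dates are the four listed above; the dictionary layout and the `lag_days` name are only illustrative.

```python
from datetime import date

# Milestone dates from the roadmap: (manual release, automated porting system).
MILESTONES = {
    "2.9.2": (date(2011, 2, 28), date(2011, 3, 28)),
    "2.9.4": (date(2011, 3, 28), date(2011, 4, 25)),
}


def lag_days(version: str) -> int:
    """Days by which the porting-system milestone trails the manual release."""
    release, port_system = MILESTONES[version]
    return (port_system - release).days


for version in MILESTONES:
    print(version, lag_days(version))
```

Both gaps come out to 28 days, so "one month" here means four weeks in each case.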

Thanks,
Troy


On Fri, Feb 18, 2011 at 1:45 AM, Prescott Nasser geobmx...@hotmail.com wrote:

 Do you imagine us doing all the catching up on backlog by hand? And then 
 later getting the automated conversion out?



 
 From: thowar...@gmail.com
 Date: Fri, 18 Feb 2011 01:36:00 -0800
 Subject: Proposed Roadmap
 To: lucene-net-...@lucene.apache.org

 All,

 Following up on Scott's post asking about JIRA issues and our
 development road map, I've put together a more detailed idea of how
 we could divide work, schedule releases, and clean up the backlog.

 There will be at least four main areas of work to address in the
 upcoming months:
 - Project Maintenance
 - Catching up with the backlog
 - Working on a new porting system
 - The Future: New API and Lucene 3.X

 Each one of those paths will need a separate road map and plan. In
 JIRA these should probably be listed as separate components, along
 with more structural components like Lucene.Net Core, Lucene.Net
 Contrib, Luke.Net, etc...

 Assuming we are working on these in parallel, I've included some
 rough estimate dates for completion of each listed milestone in
 the road maps.


 Project Maintenance
 ===================

 This includes the various aspects of transition from Lucene
 subproject to Incubator Podling, as well as updating the website
 and documentation.


 Roadmap

 * Website and branding Update (02/28/2011)
 - LUCENENET-379 - Clean up Lucene.Net website
 * NOTE: Should probably be split into a few separate issues:
 * Update website to be Apache CMS based
 * Update website to reflect current status and information
 * New site layout
 * New logo design

 * Documentation Update (03/28/2011)
 - LUCENENET-382 - Create a wiki page for Lucene.Net
 * NOTE: This should probably have more detailed tasks defined
 and separately assigned / managed. This should focus on 2.9.2
 level code, examples, etc, plus FAQ, Design, and other
 similar documentation.


 Catching up with the Backlog
 ============================

 This includes finalizing the 2.9.2 release, updating that to
 Lucene 2.9.4 compatibility and applying outstanding patches and
 bug fixes. This will put us slightly out of sync with Java Lucene
 because we'll have additional patches applied that the Java Lucene
 project does not have for our 2.9.4 release.


 Roadmap

 * Lucene.Net 2.9.2 Binary release (02/28/2011)
 - LUCENENET-381 - Official release of Lucene.Net 2.9.2
 * Build from existing tag, no new changes

 * Lucene.Net 2.9.4 Source/Binary release (03/28/2011)

 *EASY STUFF*
 - LUCENENET-389 - Signing the released assembly
 - LUCENENET-377 - Upgrade solution to VS2010
 - LUCENENET-361 - Workaround for a Mono C# compiler issue
 - LUCENENET-266 - Putting support classes in separate files and
 in a separate directory
 - LUCENENET-337 - TokenAttribute for Selectively Including Tokens
 in Length Norm
 - LUCENENET-330 - Search.Regex minimal port
 - LUCENENET-371 - Unit test for Search.Regex port
 - LUCENENET-374 - IndexReader.IsCurrent returning false
 positive in some cases
 - LUCENENET-179 - SnowballFilter speed improvement

 *HARDER STUFF*
 - LUCENENET-??? Rollup changes from Lucene 2.9.3/2.9.4 releases
 - LUCENENET-372 - NLS pack for Lucene.NET: BR, CJK, CN, CZ, DE,
 FR, NL, RU analyzers
 * NOTE: For v1.4 This code could be a starting point for a
 2.9.2 compatible version
 - LUCENENET-391 - Luke.Net for Lucene.Net
 * NOTE: For v1.4 This code could be a starting point for a
 2.9.2 compatible version
 - LUCENENET-172 - This patch fixes the unexceptional exceptions
 encountered in FastCharStream and SupportClass
 * NOTE: Evaluate concerns expressed by George A. for this patch
 - LUCENENET-167 - Compact Framework / Silverlight Support
 * NOTE: Evaluate required steps and impact this will have on
 source code. Perhaps create a branch for CF/SilverLight.
 - LUCENENET-378 - Objects with a Close method should support
 IDisposable
 * NOTE: Significant diversion from Java, involves a lot of
 code-touch. Maybe take some ideas from and/or incorporate
 changes from various CodeProject forks?


 Working on a New Porting

Seeking Participants / Topics for Roadmap Panel @ Lucene Revolution

2010-08-27 Thread Chris Hostetter


Hey folks,

As you may know, we're planning on having a Roadmap Panel discussion at 
Lucene Revolution to talk about the future of Lucene/Solr.  The hope is to 
have an interesting/insightful discussion of where folks think the project 
is headed, what features we anticipate in the next few releases, etc...


http://lucenerevolution.org/Presentation-Abstracts-Day2#roadmap

So far, Yonik, Grant, McCandless, and myself have all agreed to 
participate; but any and all Lucene/Solr committers attending the 
conference are invited (and encouraged) to be on the panel as well -- just 
let me know if you are interested.


The plan is to keep things very casual.  Ideally it will just be an 
informal (and informative) discussion fueled by questions from the 
audience -- but to be safe, it seems wise to have a list of 
concepts/features people think are worth talking about that we can fall 
back on as needed.  I've started a list on the wiki, anyone with thoughts 
on what they'd like to discuss (or hear discussed) should feel free to add 
their own ideas...


  http://wiki.apache.org/lucene-java/LuceneRevolution2010

-Hoss

--
http://lucenerevolution.org/  ...  October 7-8, Boston
http://bit.ly/stump-hoss  ...  Stump The Chump!


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucy] Roadmap for first release

2010-07-30 Thread Marvin Humphrey
On Thu, Jul 29, 2010 at 08:31:27PM -0700, Mattmann, Chris A (388J) wrote:
> Thanks, Peter for the email. I kind of guessed it was more related to
> KinoSearch.

Since Lucy is about to assimilate the KinoSearch code base, they are now
effectively the same project.  I was perusing the early days of the
SpamAssassin dev mailing list today to get a feel for how our code import
would go, and I found a thread called "Where to fix the stable branch":

http://markmail.org/message/e5avbm6z2pldv55u

It addressed the problem of making a last bugfix release on GPL'd code in the
2.6 branch, when there was CLA'd code at Apache for 2.7+.  I think the
situation is analogous, and that like SpamAssassin, we will move on and leave
the issue behind with time.

> I think it would be good to keep the discussions focused on Apache over
> here, even though I'm aware of the transition as part of the podling.

I appreciate your keeping our eyes on the prize.  I agree that completing
the transition as soon as possible is very important.  Frankly, nobody wants
this dualism to end more than I do.  

For me, this is mostly about compromise and community.  Peter is an important
contributor.  Father Chrysostomos is an important contributor.  I want them to
feel that the KinoSearch community appreciated the extensions that they wrote
and supported them.  My preferred mechanism for that would have been to fork
off Lucy1 and provide them with patches for their extensions to work within
that namespace.  But Peter, at least, feels very strongly that we ought to do
a KinoSearch release.  I'm -0.1 on the idea, but I don't think the
consequences will be severe, and I want Peter to feel like a stakeholder.

Here's the roadmap as I see things now:

KinoSearch 0.30_11
* Misc Bugfixes.
* Move some classes around.
KinoSearch 0.31
* KS 0.30_11 with a version number increment.
Lucy 0.1 
* KS 0.31 with a new namespace, new license, and a new home.
Lucy 0.2
* Introduce numeric field types.
Lucy1 1.0
* Forked from Lucy 0.2.x once things settle down.

The only significant change of plans here is the insertion of KinoSearch 0.31
into the roadmap.  The changes that will go into KS 0.30_11 are the same as
what we would have done anyway, and I believe the discussions about those
belong here, as they will directly impact the API of Lucy 0.1.  For example,
I'm about to propose moving all of our Analyzers out of core, like Lucene has.
I don't think that discussion should take place on the KS list.

I also believe that potential Mentor impatience will help keep us from getting
sidetracked and spending too much time on pre-transition changes.  :)

Marvin Humphrey



Re: [Lucy] Roadmap for first release

2010-07-30 Thread Mattmann, Chris A (388J)
Hi Marvin,

Yeah, my big concern is that you're talking about a project that doesn't
live at Apache (yet) on Apache lists. I like your timeline below, but there
are no dates behind them? How long does it take to cut a 0.31 KinoSearch
release? My hope is a few days, not a few weeks. We have Incubator reports
to file over here in Apache-land (in a few weeks), and those reports need
*Apache* milestones -- not milestones from a project that isn't at Apache.
This dualism has to stop and the development/mailing list discussion/release
cycle needs to occur over here at Apache.

Cheers,
Chris





++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++




Re: [Lucy] Roadmap for first release

2010-07-29 Thread Peter Karman
Marvin Humphrey wrote on 7/29/10 8:37 PM:
> Peter,
>
> I'm willing to go with making a KinoSearch 0.31 release.  It would also be
> nice if we could go through the various KSx extensions on CPAN (particularly
> yours and Father C's) and update them to work with it.

great.

I think most of Father C's wildcard features are re-implemented in
Search::Query::Dialect::KSx which works with the latest KS. So I'm at least
familiar with his code to that extent.

> I'd also been planning on shutting down the KinoSearch lists and reworking
> rectangular.com to point everybody at Lucy hmmm... In fact, I think we
> should still do just that: shunt all attention and communication to Lucy, by
> listing the Lucy website and mailing lists in the KinoSearch documentation
> under SUPPORT.

+1

-- 
Peter Karman  .  http://peknet.com/  .  pe...@peknet.com


Re: [Lucy] Roadmap for first release

2010-07-29 Thread Mattmann, Chris A (388J)
Guys:

Can you enlighten me as to what this has to do with *Apache* Lucy? And
furthermore, what it has to do with the *Incubator podling*?

Thanks,
Chris









Re: [Lucy] Roadmap for first release

2010-07-29 Thread Peter Karman
Mattmann, Chris A (388J) wrote on 7/29/10 9:07 PM:
> Guys:
>
> Can you enlighten me as to what this has to do with *Apache* Lucy? And
> furthermore, what it has to do with the *Incubator podling*?
>

Chris,

This thread veered off into the future of KS vis-a-vis Lucy, and probably could
have been moved to the KS list. It's related to Apache Lucy and the Incubator
podling in terms of (a) the feature set that Apache Lucy 1.0 will have, and (b)
the strategy for how to encourage the existing KS community to migrate to Lucy.

But I think the thread is done, unless Marvin has more to add.

pek
-- 
Peter Karman  .  http://peknet.com/  .  pe...@peknet.com


Re: [Lucy] Roadmap for first release

2010-07-23 Thread Peter Karman
Marvin Humphrey wrote on 7/23/10 3:27 PM:
> On Fri, Jul 23, 2010 at 11:00:58AM -0500, Peter Karman wrote:
>> those all sound good for Lucy. Should not impede the KS3 release though. I
>> imagine Lucy1 as an improvement on KS3, inspiring users to migrate.
>
> Forking and releasing KS3 is not a huge development burden in the grand scheme
> of things, but I'm not sure it's wise from a marketing perspective.

I think a KS release does two things:

(1) it says to the existing community that they have not spent years waiting in
vain for a stable, production-ready, blessed release of KS, and
(2) it is consistent with "release early and often."

You wrote earlier in this thread:

> It probably makes sense to make one more KinoSearch release
> addressing some of the issues for the transition.

I am agreeing with that. I think it should either be KinoSearch3, or KinoSearch
0.3, I don't care which. KinoSearch 0.3 seems like it would be less confusing.

I don't think a stable KS release undermines Lucy's marketing efforts. Working,
stable code is a good thing. The KS release could clearly state in its
documentation that it represents several years of development effort,
culminating in a final, stable release of the mmap-based index format, and that
future development will be on Lucy. Anyone who has paid a minute's attention to
the KS project over the last few years knows about this Lucy thing but the
timing and schedule has all been a bit hand-wavey. So there's no surprise here;
instead there are definites: a final stable KS release, and adoption by the
Apache Incubator of the Lucy fork of KS.

We could make a final KS release tomorrow, handshakes and champagne all around,
and move on.

 
> From my perspective, the sooner that KinoSearch gets EOL'd and we can focus on
> Lucy development in earnest, the better.  I wouldn't mind having some extra
> pressure to release Lucy1 ASAP because we didn't release KinoSearch3.  :)

I agree with that first sentence. I think we're negotiating what EOL'd means.
I think it should mean a 0.3 release, with no more releases except for security
fixes.

-- 
Peter Karman  .  http://peknet.com/  .  pe...@peknet.com


Roadmap for first release

2010-07-18 Thread Marvin Humphrey
Greets,

At the risk of counting our chickens before they hatch, the proposal to
assimilate the KinoSearch codebase and the move to the Incubator seems likely
to pass the lazy consensus vote now underway.  :)  It's time to start thinking
about what ought to be in Lucy's first release.

I propose a minimalist strategy that will allow us to get to a release as
quickly as possible:

  1. Branch Lucy off the last KS bugfix release rather than svn trunk.
  2. Perform IP clearance and relicensing.
  3. Perform a few massive find and replace operations to change the imported
 codebase to Lucy.
  4. Add code to enable Lucy to read existing KinoSearch 0.3x indexes.
  5. Consider moving a few classes around.
  6. Write a Lucy::Docs::KinoSearch2Lucy transition guide and a
 kinosearch2lucy.pl tool to adapt user codebases using 0.3x
 automatically.
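
Step 3 could be approached along these lines. This is a minimal Python sketch, not the kinosearch2lucy.pl tool itself (which the thread only proposes): the `KinoSearch` to `Lucy` mapping follows from the plan above, while the `KSx` to `LucyX` pair, the function name, and the regex are assumptions made for illustration.

```python
import re

# Hypothetical namespace mapping. Only KinoSearch -> Lucy is implied by the
# plan; KSx -> LucyX is an assumption based on the LucyX name used elsewhere.
NAMESPACE_MAP = {
    "KinoSearch": "Lucy",
    "KSx": "LucyX",
}

# Match a mapped name only when it stands alone as an identifier component,
# so names such as "KinoSearch1" or "MyKinoSearchWrapper" are left untouched.
_NAMES = "|".join(re.escape(name) for name in NAMESPACE_MAP)
_PATTERN = re.compile(r"(?<![A-Za-z0-9_])(" + _NAMES + r")(?![A-Za-z0-9_])")


def rename_namespaces(source: str) -> str:
    """Return source text with each mapped namespace replaced by its new name."""
    return _PATTERN.sub(lambda match: NAMESPACE_MAP[match.group(1)], source)


if __name__ == "__main__":
    print(rename_namespaces("use KinoSearch::Search::IndexSearcher;"))
```

Anchoring the match on identifier boundaries is the main design choice here: a blind string replace would also rewrite unrelated names that merely contain the old namespace.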

A fair amount of work has been done on KinoSearch's svn trunk since the last
bugfix release, but it won't be hard to reproduce that work, and there aren't
any IP issues that require those commits to go through IP clearance.  By going
with code that has already lived in production environments, we give ourselves
a better chance at making a good first impression via code that just works
for new users.  New users will surely mean new bug reports and we should plan
to make Lucy bugfix releases, but hopefully by minimizing churn we will make
it possible to focus on user support and evangelizing in the wake of the
initial release rather than bughunting.

With regards to moving a few classes around... To paraphrase Yonik Seeley,
every time you change an API, you destroy part of the community's collective
memory.  There are some migrations that were already underway, such as moving
Similarity out of Search and underneath Index.  There are other migrations we
should consider now, such as moving Stemmer outside of core, to
LucyX::Analysis::SnowballStemmer.  IMO, it would be better to complete such
moves prior to the first Lucy release, rather than destroy collective memory
later.

It probably makes sense to make one more KinoSearch release addressing some of
the issues for the transition.  In particular, in svn trunk, FullTextType and
StringType have been consolidated into TextType.  It would be nice if new Lucy
users never had to think about distinguishing between FullTextType and
StringType, since they're going away anyway.

I think we draw the line at moving classes around, though.  There are a number
of other issues that need to be addressed before we fork off a stable Lucy1
branch.  For instance, I think multi-stream posting files should be a blocker
issue for Lucy1 because of the search-time performance implications, and
there's been a fair amount of work done in svn trunk towards resolving that
issue.  However, I think we should take Hoss's advice into account when
scheduling such issues:

http://markmail.org/message/f6fda4xprom5marg

that was the hardest thing for me to wrap my head around when Solr
was incubating -- in many ways i was actively trying to keep Solr a
secret until i felt like it was ready to be unveiled, but that's not
what incubation is about, and it's really the antithesis of how to have
a successful project -- you don't get a lot of contributors all at once by
saying "here it is, we've got something that's stable and solid and
'done', who wants to come be a part of it?" .. you get contributors slowly
and surely by saying "here's what we've got so far, who wants to help us
make this better?"

I think if we market the initial Lucy release as "help us get to a stable
release", then A) people will be more forgiving of our work in progress, and
B) we may attract more contributors.  We can put off the more disruptive work
until later.

Sound like a plan?

Marvin Humphrey



Re: SolrCloud integration roadmap

2010-06-01 Thread olivier sallou
Well,
I would be glad to help on this feature. In the very short term, I must first
continue with my prototype without Cloud features, but after that I would be
happy to help (I could also help with testing).
I think I will first take a snapshot from trunk when available and test it
on my platform where I can easily set up multiple virtual servers to test.

Olivier





Re: SolrCloud integration roadmap

2010-06-01 Thread Simon Willnauer
Olivier, you can already get a snapshot from the branch

https://svn.apache.org/repos/asf/lucene/solr/branches/cloud/

and see the wiki for details: http://wiki.apache.org/solr/SolrCloud

simon




Re: SolrCloud integration roadmap

2010-05-30 Thread Simon Willnauer
On Sun, May 30, 2010 at 6:03 PM, olivier sallou
olivier.sal...@gmail.com wrote:
> Hi,
> I'd like to know when the SolrCloud feature will be released in Solr? I saw a
> Jira track about this to integrate in trunk but I cannot see a related
> roadmap.
The patch might be integrated into trunk shortly I assume - a release
isn't that near right now for various reasons.
If you really need this feature you should probably join and help
pushing it forwards, there is lots of work to do on the indexing side
of things.

simon
> I definitely need this feature I was going to develop myself as an additional
> layer above Solr (at least partially for my needs) just before reading a
> wiki article about it.
>
> Regards
>
> Olivier





Re: SolrCloud integration roadmap

2010-05-30 Thread Mark Miller

On 5/30/10 6:03 PM, olivier sallou wrote:

> Hi,
> I'd like to know when the SolrCloud feature will be released in Solr? I saw
> a Jira track about this to integrate in trunk but I cannot see a related
> roadmap.
> I definitely need this feature I was going to develop myself as an
> additional layer above Solr (at least partially for my needs) just
> before reading a wiki article about it.
>
> Regards
>
> Olivier


Hey Olivier - Just got back from vacation and I hope to get the first 
phase of SolrCloud committed to trunk very soon - I'll see if I can't 
make it happen this week.


--
- Mark

http://www.lucidimagination.com




Re: Nutch 2.0 roadmap

2010-04-08 Thread Doğacan Güney
On Wed, Apr 7, 2010 at 20:32, Andrzej Bialecki a...@getopt.org wrote:
> On 2010-04-07 18:54, Doğacan Güney wrote:
>> Hey everyone,
>>
>> On Tue, Apr 6, 2010 at 20:23, Andrzej Bialecki a...@getopt.org wrote:
>>> On 2010-04-06 15:43, Julien Nioche wrote:
>>>> Hi guys,
>>>>
>>>> I gather that we'll jump straight to 2.0 after 1.1 and that 2.0 will be
>>>> based on what is currently referred to as NutchBase. Shall we create a
>>>> branch for 2.0 in the Nutch SVN repository and have a label accordingly for
>>>> JIRA so that we can file issues / feature requests on 2.0? Do you think that
>>>> the current NutchBase could be used as a basis for the 2.0 branch?
>>>
>>> I'm not sure what is the status of the nutchbase - it's missed a lot of
>>> fixes and changes in trunk since it's been last touched ...
>>>
>>
>> I know... But I still intend to finish it, I just need to schedule
>> some time for it.
>>
>> My vote would be to go with nutchbase.
>
> Hmm .. this puzzles me, do you think we should port changes from 1.1 to
> nutchbase? I thought we should do it the other way around, i.e. merge
> nutchbase bits to trunk.
>

Hmm, I am a bit out of touch with the latest changes but I know that
the differences between trunk and nutchbase are unfortunately rather
large right now. If merging nutchbase back into trunk would be easier
then sure, let's do that.

>>>> * support for HBase : via ORM or not (see
>>>> NUTCH-808 <https://issues.apache.org/jira/browse/NUTCH-808>)
>>>
>>> This IMHO is promising, this could open the doors to small-to-medium
>>> installations that are currently too cumbersome to handle.
>>>
>>
>> Yeah, there is already a simple ORM within nutchbase that is
>> avro-based and should be generic enough to also support MySQL,
>> cassandra and berkeleydb. But any good ORM will be a very good addition.
>
> Again, the advantage of DataNucleus is that we don't have to handcraft
> all the mid- to low-level mappings, just the mid-level ones (JOQL or
> whatever) - the cost of maintenance is lower, and the number of backends
> that are supported out of the box is larger. Of course, this is just
> IMHO - we won't know for sure until we try to use both your custom ORM
> and DataNucleus...

I am obviously a bit biased here but I have no strong feelings really.
DataNucleus is an excellent project. What I like about the avro-based
approach is the essentially free MapReduce support we get and the fact
that supporting another language is easy. So, we can expose partial hbase
data through a server and a python-client can easily read/write to it,
thanks to avro. That being said, I am all for DataNucleus or something else.


> --
> Best regards,
> Andrzej Bialecki
>  ___. ___ ___ ___ _ _   __
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com




-- 
Doğacan Güney


Re: Nutch 2.0 roadmap

2010-04-08 Thread Doğacan Güney
Hi,

On Wed, Apr 7, 2010 at 21:19, MilleBii mille...@gmail.com wrote:
> Just a question:
> Will the new HBase implementation allow more sophisticated crawling
> strategies than the current score-based one?
>
> To give you a few examples of what I'd like to do:
> Define different crawling frequencies for different sets of URLs, say
> weekly for some URLs, monthly or more for others.
>
> Select URLs to re-crawl based on attributes previously extracted. Just
> one example: recrawl URLs that contained a certain keyword (or set of
> keywords).
>
> Select URLs that have not yet been crawled, i.e. those at the frontier
> of the crawl.

At some point, it would be nice to change generator so that it is only a
handful of methods and a Pig (or something else) script. So, we would
provide most of the functions you may need during generation (accessing
various data) but actual generation would be a Pig process. This way,
anyone can easily change generate any way they want (even make it more
jobs than 2 if they want more complex schemes).
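
To make the idea concrete, here is a minimal Python sketch of a generator built from small, swappable selection functions, covering the three strategies asked about above (per-group re-crawl intervals, keyword-based re-crawl, and frontier URLs). It is not Nutch code and does not use Pig; `UrlRecord`, its fields, and every function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set


@dataclass
class UrlRecord:
    """Hypothetical per-URL row; real crawl storage would hold far more."""
    url: str
    last_fetched: Optional[datetime] = None  # None means never crawled
    group: str = "default"
    keywords: FrozenSet[str] = frozenset()


# A selector decides whether one record belongs in the next fetch list.
Selector = Callable[[UrlRecord, datetime], bool]


def due_by_group(intervals: Dict[str, timedelta], default: timedelta) -> Selector:
    """Select already-fetched URLs whose group-specific interval has elapsed."""
    def select(rec: UrlRecord, now: datetime) -> bool:
        if rec.last_fetched is None:
            return False
        return now - rec.last_fetched >= intervals.get(rec.group, default)
    return select


def has_keyword(wanted: Set[str]) -> Selector:
    """Select URLs whose previously extracted keywords overlap the wanted set."""
    return lambda rec, now: bool(wanted & rec.keywords)


def frontier() -> Selector:
    """Select URLs that have never been fetched."""
    return lambda rec, now: rec.last_fetched is None


def generate(records: Iterable[UrlRecord], selectors: List[Selector],
             now: datetime) -> List[str]:
    """Return URLs matched by at least one selector, in input order."""
    return [r.url for r in records if any(s(r, now) for s in selectors)]
```

Keeping each strategy as a separate function means a deployment can combine or replace them without touching `generate` itself, which is the kind of flexibility described above.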




 2010/4/7, Doğacan Güney doga...@gmail.com:
 Hey everyone,

 On Tue, Apr 6, 2010 at 20:23, Andrzej Bialecki a...@getopt.org wrote:
 On 2010-04-06 15:43, Julien Nioche wrote:
 Hi guys,

 I gather that we'll jump straight to  2.0 after 1.1 and that 2.0 will be
 based on what is currently referred to as NutchBase. Shall we create a
 branch for 2.0 in the Nutch SVN repository and have a label accordingly
 for
 JIRA so that we can file issues / feature requests on 2.0? Do you think
 that
 the current NutchBase could be used as a basis for the 2.0 branch?

 I'm not sure what is the status of the nutchbase - it's missed a lot of
 fixes and changes in trunk since it's been last touched ...


 I know... But I still intend to finish it, I just need to schedule
 some time for it.

 My vote would be to go with nutchbase.


 Talking about features, what else would we add apart from :

 * support for HBase : via ORM or not (see
 NUTCH-808 <https://issues.apache.org/jira/browse/NUTCH-808>)

 This IMHO is promising, this could open the doors to small-to-medium
 installations that are currently too cumbersome to handle.


 Yeah, there is already a simple ORM within nutchbase that is
 avro-based and should
 be generic enough to also support MySQL, cassandra and berkeleydb. But
 any good ORM will
 be a very good addition.

 * plugin cleanup : Tika only for parsing - get rid of everything else?

 Basically, yes - keep only stuff like HtmlParseFilters (probably with a
 different API) so that we can post-process the DOM created in Tika from
 whatever original format.

 Also, the goal of the crawler-commons project is to provide APIs and
 implementations of stuff that is needed for every open source crawler
 project, like: robots handling, url filtering and url normalization, URL
 state management, perhaps deduplication. We should coordinate our
 efforts, and share code freely so that other projects (bixo, heritrix,
 droids) may contribute to this shared pool of functionality, much like
 Tika does for the common need of parsing complex formats.

 * remove index / search and delegate to SOLR

 +1 - we may still keep a thin abstract layer to allow other
 indexing/search backends, but the current mess of indexing/query filters
 and competing indexing frameworks (lucene, fields, solr) should go away.
 We should go directly from DOM to a NutchDocument, and stop there.


 Agreed. I would like to add support for katta and other indexing
 backends at some point but
 NutchDocument should be our canonical representation. The rest should
 be up to indexing backends.

 Regarding search - currently the search API is too low-level, with the
 custom text and query analysis chains. This needlessly introduces the
 (in)famous Nutch Query classes and Nutch query syntax limitations, We
 should get rid of it and simply leave this part of the processing to the
 search backend. Probably we will use the SolrCloud branch that supports
 sharding and global IDF.

 * new functionalities e.g. sitemap support, canonical tag etc...

 Plus a better handling of redirects, detecting duplicated sites,
 detection of spam cliques, tools to manage the webgraph, etc.


 I suppose that http://wiki.apache.org/nutch/Nutch2Architecture needs an
 update?

 Definitely. :)

 --
 Best regards,
 Andrzej Bialecki     
  ___. ___ ___ ___ _ _   __
 [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
 ___|||__||  \|  ||  |  Embedded Unix, System Integration
 http://www.sigram.com  Contact: info at sigram dot com





 --
 Doğacan Güney



 --
 -MilleBii-




-- 
Doğacan Güney


Re: Nutch 2.0 roadmap

2010-04-08 Thread MilleBii
Not sure what you mean by a Pig script, but I'd like to be able to make a
multi-criteria selection of URLs for fetching...
The scoring method forces a kind of one-dimensional approach,
which is not really easy to deal with.

The regex filters are good, but they assume you want to select URLs based on
data that is in the URL itself... pretty limited, in fact.

I would basically like to do 'content'-based crawling. Say, for
example, that I'm interested in topic A.
I'd like to label URLs that match topic A (user-supplied logic).
Later on I would want to crawl topic A URLs at a certain frequency,
and unlabeled URLs in a different way, for exploration.

This looks hard to do right now.
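
A minimal sketch of the label-based scheme described above, assuming a
keyword match as the user-supplied logic and one refetch interval per label.
All names here (label_page, REFETCH_INTERVAL, is_due, select_due) and the
interval values are hypothetical; nothing below is an existing Nutch API:

```python
from datetime import datetime, timedelta

# Assumed policy: labelled URLs are refetched weekly, everything else monthly.
REFETCH_INTERVAL = {"topic-a": timedelta(days=7)}
DEFAULT_INTERVAL = timedelta(days=30)


def label_page(text, keywords=("solar",)):
    """User-supplied logic: label a page 'topic-a' if it mentions a keyword."""
    lowered = text.lower()
    return "topic-a" if any(k in lowered for k in keywords) else None


def is_due(label, last_fetched, now):
    """A URL is due if it was never fetched or its label's interval elapsed."""
    if last_fetched is None:
        return True  # frontier of the crawl
    interval = REFETCH_INTERVAL.get(label, DEFAULT_INTERVAL)
    return now - last_fetched >= interval


def select_due(records, now):
    """records: iterable of (url, label, last_fetched) tuples."""
    return [url for url, label, last in records if is_due(label, last, now)]


now = datetime(2010, 4, 8)
records = [
    ("u1", "topic-a", now - timedelta(days=8)),
    ("u2", "topic-a", now - timedelta(days=3)),
    ("u3", None, now - timedelta(days=10)),
    ("u4", None, now - timedelta(days=31)),
    ("u5", None, None),
]
due = select_due(records, now)
```

The label would be stored with the URL at parse time, so selection depends on
previously extracted content and not on the URL string.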


Re: Nutch 2.0 roadmap

2010-04-08 Thread Doğacan Güney
On Thu, Apr 8, 2010 at 21:11, MilleBii mille...@gmail.com wrote:
> Not sure what u mean by pig script, but I'd like to be able to make a
> multi-criteria selection of Url for fetching...

I mean a query language like Pig:

http://hadoop.apache.org/pig/

If we expose the data correctly, then you should be able to generate on any
criteria that you want.

> The scoring method forces into a kind of mono dimensional approach
> which is not really easy to deal with.
>
> The regex filters are good but it assumes you want select URLs on data
> which is in the URL... Pretty limited in fact
>
> I basically would like to do 'content' based crawling. Say for
> example: that I'm interested in topic A.
> I'd'like to label URLs that match Topic A (user supplied logic).
> Later on I would want to crawl topic A urls at a certain frequency
> and non labeled urls for exploring in a different way.
>
> This looks like hard to do right now


Re: Nutch 2.0 roadmap

2010-04-07 Thread Julien Nioche
Hi,

> I'm not sure what is the status of the nutchbase - it's missed a lot of
> fixes and changes in trunk since it's been last touched ...


yes, maybe we should start the 2.0 branch from 1.1 instead
Dogacan - what do you think?

BTW I see there is now a 2.0 label under JIRA, thanks to whoever added it


> Also, the goal of the crawler-commons project is to provide APIs and
> implementations of stuff that is needed for every open source crawler
> project, like: robots handling, url filtering and url normalization, URL
> state management, perhaps deduplication. We should coordinate our
> efforts, and share code freely so that other projects (bixo, heritrix,
> droids) may contribute to this shared pool of functionality, much like
> Tika does for the common need of parsing complex formats.


definitely

> +1 - we may still keep a thin abstract layer to allow other
> indexing/search backends, but the current mess of indexing/query filters
> and competing indexing frameworks (lucene, fields, solr) should go away.
> We should go directly from DOM to a NutchDocument, and stop there.



I think that separating the parsing filters from the indexing filters can
have its merits e.g. combining the metadata generated by 2 or more different
parsing filters into a single field in the NutchDocument, keeping only a
subset of the available information etc...
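
A minimal sketch of that separation, assuming parse filters emit a plain
mapping of metadata names to value lists. The function index_filter and the
field names are made up for this example and do not reflect the actual Nutch
plugin API:

```python
def index_filter(parse_metadata, merge, keep):
    """Build the indexed fields from metadata produced by parse filters.

    merge: target field -> list of source metadata names to combine.
    keep:  metadata names copied through unchanged; all others are dropped.
    """
    doc = {}
    for target, sources in merge.items():
        values = []
        for source in sources:
            for value in parse_metadata.get(source, []):
                if value not in values:  # de-duplicate, keep first-seen order
                    values.append(value)
        if values:
            doc[target] = values
    for name in keep:
        if name in parse_metadata:
            doc[name] = list(parse_metadata[name])
    return doc


# Output of two hypothetical parse filters, plus data we do not want indexed.
meta = {
    "meta.keywords": ["nutch", "crawler"],
    "rel.tag": ["crawler", "hadoop"],
    "title": ["Nutch"],
    "debug.timing": ["12ms"],
}
fields = index_filter(meta, {"tags": ["meta.keywords", "rel.tag"]}, ["title"])
```

Here the two keyword sources end up in a single "tags" field and the timing
entry never reaches the document.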


 
>> I suppose that http://wiki.apache.org/nutch/Nutch2Architecture needs an
>> update?


Have created a new page to serve as a support for discussion :
http://wiki.apache.org/nutch/Nutch2Roadmap

julien
-- 
DigitalPebble Ltd
http://www.digitalpebble.com


Re: Nutch 2.0 roadmap

2010-04-07 Thread Doğacan Güney
Hey everyone,

On Tue, Apr 6, 2010 at 20:23, Andrzej Bialecki a...@getopt.org wrote:
> On 2010-04-06 15:43, Julien Nioche wrote:
>> Hi guys,
>>
>> I gather that we'll jump straight to  2.0 after 1.1 and that 2.0 will be
>> based on what is currently referred to as NutchBase. Shall we create a
>> branch for 2.0 in the Nutch SVN repository and have a label accordingly for
>> JIRA so that we can file issues / feature requests on 2.0? Do you think that
>> the current NutchBase could be used as a basis for the 2.0 branch?
>
> I'm not sure what is the status of the nutchbase - it's missed a lot of
> fixes and changes in trunk since it's been last touched ...


I know... But I still intend to finish it, I just need to schedule
some time for it.

My vote would be to go with nutchbase.


>> Talking about features, what else would we add apart from :
>>
>> * support for HBase : via ORM or not (see NUTCH-808,
>> https://issues.apache.org/jira/browse/NUTCH-808)
>
> This IMHO is promising, this could open the doors to small-to-medium
> installations that are currently too cumbersome to handle.


Yeah, there is already a simple ORM within nutchbase that is avro-based and
should be generic enough to also support MySQL, cassandra and berkeleydb. But
any good ORM will be a very good addition.

>> * plugin cleanup : Tika only for parsing - get rid of everything else?
>
> Basically, yes - keep only stuff like HtmlParseFilters (probably with a
> different API) so that we can post-process the DOM created in Tika from
> whatever original format.
>
> Also, the goal of the crawler-commons project is to provide APIs and
> implementations of stuff that is needed for every open source crawler
> project, like: robots handling, url filtering and url normalization, URL
> state management, perhaps deduplication. We should coordinate our
> efforts, and share code freely so that other projects (bixo, heritrix,
> droids) may contribute to this shared pool of functionality, much like
> Tika does for the common need of parsing complex formats.
>
>> * remove index / search and delegate to SOLR
>
> +1 - we may still keep a thin abstract layer to allow other
> indexing/search backends, but the current mess of indexing/query filters
> and competing indexing frameworks (lucene, fields, solr) should go away.
> We should go directly from DOM to a NutchDocument, and stop there.


Agreed. I would like to add support for katta and other indexing backends at
some point, but NutchDocument should be our canonical representation. The rest
should be up to indexing backends.
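
A minimal sketch of that split, assuming a single canonical document type and
a thin backend interface. The Document class below is a hypothetical stand-in
for NutchDocument, and IndexBackend, InMemoryBackend and index_all are names
invented for this example:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for the canonical document: a URL plus named field values."""
    url: str
    fields: dict = field(default_factory=dict)


class IndexBackend(ABC):
    """Thin layer each indexing backend would implement."""

    @abstractmethod
    def write(self, doc):
        """Store or send one canonical document."""


class InMemoryBackend(IndexBackend):
    """Toy backend used only to exercise the interface."""

    def __init__(self):
        self.docs = {}

    def write(self, doc):
        self.docs[doc.url] = dict(doc.fields)


def index_all(docs, backend):
    """The crawler side stops at Document; the backend does the rest."""
    count = 0
    for doc in docs:
        backend.write(doc)
        count += 1
    return count


backend = InMemoryBackend()
written = index_all([Document("http://example.org/", {"title": ["Home"]})], backend)
```

A Solr or katta backend would be another subclass; the crawler code calling
index_all would not change.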

> Regarding search - currently the search API is too low-level, with the
> custom text and query analysis chains. This needlessly introduces the
> (in)famous Nutch Query classes and Nutch query syntax limitations. We
> should get rid of it and simply leave this part of the processing to the
> search backend. Probably we will use the SolrCloud branch that supports
> sharding and global IDF.
>
>> * new functionalities e.g. sitemap support, canonical tag etc...
>
> Plus a better handling of redirects, detecting duplicated sites,
> detection of spam cliques, tools to manage the webgraph, etc.
>
>> I suppose that http://wiki.apache.org/nutch/Nutch2Architecture needs an
>> update?
>
> Definitely. :)
>
> --
> Best regards,
> Andrzej Bialecki
>  ___. ___ ___ ___ _ _   __
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com





-- 
Doğacan Güney

