Re: Deprecate Schemaless Mode?

Marcus Eagan Mon, 03 Aug 2020 12:51:05 -0700

Typo*: I meant deprecate vs. remove, which we obviously cannot do.

On Mon, Aug 3, 2020 at 12:05 Marcus Eagan <[email protected]> wrote:


> Furthermore, just to be clear, I opened a discussion about deprecating and
> not replacing schemaless mode for two reasons:
>
> (1) the pain it has inflicted on Solr users and on Solr's reputation; the
> deprecation logs speak volumes.
> (2) to get a better understanding of what engineers and others in the
> community use Schemaless for to inform the design of its replacement.
>
> At no point would I argue that a feature like Schemaless is unnecessary.
> It was the first way I used Solr (the second time around; the first time I
> tried it, I built my company using Elasticsearch because of other issues). I
> am of the opinion that "Schemaless Mode" has done more harm to Solr than
> good in my limited experience with the feature. Heck, *I've only been
> consulting for a week and it has already come up*. I acknowledge a very
> small sample size.
>
> I am curious as to your thoughts on these points. There are not lots of
> people getting started with Solr today relative to the other solutions on
> the market regardless of what you might assume. I am here to see if I can
> change that through a shift in how we approach user experience and the
> knowledge requisite to operate a production cluster. I hope no one takes
> offense to me challenging how some community members think about what is a
> good feature vs what is a bad one.
>
> Marcus
>
>
>
>
> On Mon, Aug 3, 2020 at 11:44 AM Marcus Eagan <[email protected]>
> wrote:
>
>> I know a person using it in production today. It's causing problems. They
>> could abandon Solr altogether. It seems like a schema creation wizard is
>> the right getting started motion if we know that schemaless doesn't do what
>> people think it does. It's misleading. It's also a false representation of
>> how easy it is to get started when compared to other solutions on the
>> market. If schemaless is about supporting new use/adoption, it should actually
>> help that more than hurt it.
>>
>> That's why I raised it. Re-branding this feature is like pig-lipsticking
>> in my mind, but you all have more experience than me and are committers. I
>> will defer to you for now. I am in favor of renaming the feature as the
>> minimum change that should happen.
>>
>> Schemaless mode makes sense in a world where schemas are largely opaque
>> like IoT-telemetry or server logs. When you are searching data primarily
>> for human consumption, I think it is just a headache in a bottle. In the
>> cases of CSV and TSV, customers know the schema. I like to approach
>> designing software such that no one ever needs to talk to me. No
>> firefighting consulting is necessary, and you can skim the docs and proceed
>> safely. I understand others may not feel that way, but it is the future of
>> software.
>>
>> I encourage everyone here to try the newer search systems that have been
>> released and are growing rapidly to inform your opinions on this topic. I
>> am doing that because it is the concrete poured to build the common ground
>> of the future.
>>
>> On Mon, Aug 3, 2020 at 11:40 AM Anshum Gupta <[email protected]>
>> wrote:
>>
>>> +1 Jason.
>>>
>>> Here's some context on how this came into being.
>>>
>>> Users find it difficult to understand and create a basic schema when
>>> just trying out Solr. This mode was supposed to help them bootstrap, and
>>> once they had a better understanding of how things worked, they'd tune it
>>> before using the schema in production.
>>> This did improve the OTB experience for new users, but a lot of people
>>> abused this convenience and used this in production causing issues.
>>>
>>> As Jason mentioned, we'd better serve our users if we left this feature
>>> for the getting started experience and added warnings (in UI and responses?)
>>> so users would know what they are doing when they take this to production.
>>>
>>> This feature isn't trappy unless people use it in ways it was not
>>> intended to be used in. We just need to warn and educate people better.
>>>
>>> On Mon, Aug 3, 2020 at 10:41 AM Jason Gerlowski <[email protected]>
>>> wrote:
>>>
>>>> > Is anyone on this list using schemaless mode in production or have
>>>> you tried to?
>>>>
>>>> Schemaless mode is one of a group of Solr features present for
>>>> convenience but not intended for production usage.  It's in the same
>>>> boat as "bin/post", and SolrCell, and others.  These features do cause
>>>> headaches when users ignore the documented restrictions and use them
>>>> for more than prototyping.  But at the same time they're super
>>>> valuable for these sorts of demo-ing or getting-started use cases.  An
>>>> easy getting-started experience is important, and schemaless et al
>>>> serve a mostly positive role in that.
>>>>
>>>> I think we'd better serve our users if we left schemaless
>>>> in/undeprecated, and instead focused on making it harder to
>>>> (unknowingly) use them in ways contrary to community recommendations.
>>>> Add louder warnings in the documentation (where not already present).
>>>> Add warnings to the Solr logs the first time these features are used.
>>>> Disable them by default (where that makes sense).  Taken to the
>>>> extreme, we could even add a section into Solr's response that lists
>>>> non-production features used in serving a given request.
>>>>
>>>> There are lots of ways to address the "feature X is trappy" problem
>>>> without removing X altogether.
>>>>
>>>> On Mon, Aug 3, 2020 at 11:33 AM Marcus Eagan <[email protected]>
>>>> wrote:
>>>> >
>>>> > Community,
>>>> >
>>>> > There are many of us that have had to deal with the pain of managing
>>>> the schemaless mode of operation in Solr. I'm curious to get others'
>>>> thoughts about how well it is working for them and if they would like to
>>>> continue to use it.
>>>> >
>>>> > I for one don't think Schemaless works as intended and favor
>>>> deprecating it and replacing it with something more usable, but I am sure others
>>>> have thoughts here.
>>>> >
>>>> > Is anyone on this list using schemaless mode in production or have
>>>> you tried to?
>>>> >
>>>> > A preliminary discussion has occurred in this Jira ticket:
>>>> https://issues.apache.org/jira/browse/SOLR-14701
>>>> >
>>>> > Thank you all,
>>>> >
>>>> > Marcus Eagan
>>>> >
>>>>
>>>
>>> --
>>> Anshum Gupta
>>>
>>
>>
>> --
>> Marcus Eagan
>>
>>
>
> --
> Marcus Eagan
>
> --
Marcus Eagan
