On Thu, Oct 31, 2019 at 4:30 PM Sean Owen wrote:
>
> . But it'd be cooler to call these major
> releases!
Maybe this is just semantics, but my point is the Scala project
already does call 2.12 to 2.13 a major release
e.g. from https://www.scala-lang.org/download/
"Note that different *major* r
On Wed, Oct 30, 2019 at 5:57 PM Sean Owen wrote:
> Or, frankly, maybe Scala should reconsider the mutual incompatibility
> between minor releases. These are basically major releases, and
> indeed, it causes exactly this kind of headache.
>
Not saying binary incompatibility is fun, but 2.12 to 2
To be more explicit, the easiest thing to do in the short term is use
your own instance of KafkaConsumer to get the offsets for the
timestamps you're interested in, using offsetsForTimes, and use those
for the start / end offsets. See
https://kafka.apache.org/10/javadoc/?org/apache/kafka/clients/c
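To make that concrete, here is a hedged sketch of the lookup (the helper name, topic, and bootstrap servers are placeholders, not from this thread):

```scala
import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

// Sketch: use your own standalone KafkaConsumer to translate a timestamp into
// per-partition offsets via offsetsForTimes, then use those as start/end offsets.
def offsetsAtTime(bootstrapServers: String, topic: String, tsMillis: Long): Map[TopicPartition, Long] = {
  val props = new Properties()
  props.put("bootstrap.servers", bootstrapServers)
  props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
  val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](props)
  try {
    val partitions = consumer.partitionsFor(topic).asScala
      .map(pi => new TopicPartition(topic, pi.partition))
    val query = partitions.map(tp => tp -> java.lang.Long.valueOf(tsMillis)).toMap.asJava
    consumer.offsetsForTimes(query).asScala.collect {
      // offsetsForTimes maps a partition to null if it has no message at/after tsMillis
      case (tp, oat) if oat != null => tp -> oat.offset
    }.toMap
  } finally consumer.close()
}
```

The resulting map could then be handed to the direct stream, e.g. as the fromOffsets argument of ConsumerStrategies.Assign.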
I feel like I've already said my piece on
https://github.com/apache/spark/pull/22138 let me know if you have
more questions.
As for SS in general, I don't have a production SS deployment, so I'm
less comfortable with reviewing large changes to it. But if no other
committers are working on it...
, Nov 22, 2018 at 7:32 PM Matei Zaharia wrote:
>
> Can we start by just recommending to contributors that they do this manually?
> Then if it seems to work fine, we can try to automate it.
>
> > On Nov 22, 2018, at 4:40 PM, Cody Koeninger wrote:
> >
> > I believe s
On Thu, Nov 22, 2018 at 9:11 AM Cody Koeninger wrote:
>>
>> Plugin invocation is ./build/mvn mvn-scalafmt_2.12:format
>>
>> It takes about 5 seconds, and errors out on the first different file
>> that doesn't match formatting.
>>
>> I made a shel
ff, seems worth a shot. What's the invocation that Shane
> could add (after this change goes in)
> On Wed, Nov 21, 2018 at 3:27 PM Cody Koeninger wrote:
> >
> > There's a mvn plugin (sbt as well, but it requires sbt 1.0+) so it
> > should be runnable from the PR builder
ad strokes but not in the details.
> Is this something that can be just run in the PR builder? if the rules
> are simple and not too hard to maintain, seems like a win.
> On Wed, Nov 21, 2018 at 2:26 PM Cody Koeninger wrote:
> >
> > Definitely not suggesting a mass reformat
so it's inevitable.
>
> Is there a way to just check style on PR changes? that's fine.
> On Wed, Nov 21, 2018 at 11:40 AM Cody Koeninger wrote:
> >
> > Is there any appetite for revisiting automating formatting?
> >
> > I know over the years various p
Is there any appetite for revisiting automating formatting?
I know over the years various people have expressed opposition to it
as unnecessary churn in diffs, but having every new contributor
greeted with "nit: 4 space indentation for argument lists" isn't very
welcoming.
---
Anastasios it looks like you already identified the two lines that
need to change, the string interpolation that depends on
UUID.randomUUID and metadataPath.hashCode.
I'd factor that out into a function that returns the group id. That
function would also need to take the "parameters" variable (th
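Something along these lines (a sketch only; the function name and the "kafka.group.id" key are hypothetical, not Spark's actual internals):

```scala
// Hypothetical sketch: put the generated-vs-configured group id decision in one
// function, so a user-supplied group id (e.g. one ops issued a certificate for)
// wins over the random spark-kafka-source-* id that would otherwise be built.
def streamingGroupId(parameters: Map[String, String], metadataPath: String): String =
  parameters.getOrElse(
    "kafka.group.id",
    s"spark-kafka-source-${java.util.UUID.randomUUID}-${metadataPath.hashCode}")
```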
Am I the only one for whom the livestream link didn't work last time?
Would like to be able to at least watch the discussion this time
around.
On Tue, Nov 13, 2018 at 6:01 PM Ryan Blue wrote:
>
> Hi everyone,
> I just wanted to send out a reminder that there’s a DSv2 sync tomorrow at
> 17:00 PST,
That sounds reasonable to me
On Fri, Nov 9, 2018 at 2:26 AM Anastasios Zouzias wrote:
>
> Hi all,
>
> I run into the following situation with Spark Structured Streaming (SS) using
> Kafka.
>
> In a project that I work on, there is already a secured Kafka setup where ops
> can issue an SSL certifica
Just got a question about this on the user list as well.
Worth removing that link to pwendell's directory from the docs?
On Sun, Jan 21, 2018 at 12:13 PM, Jacek Laskowski wrote:
> Hi,
>
> http://spark.apache.org/developer-tools.html#nightly-builds reads:
>
>> Spark nightly packages are available
+1 to Sean's comment
On Fri, Aug 31, 2018 at 2:48 PM, Reynold Xin wrote:
> Yup all good points. One way I've done it in the past is to have an appendix
> section for design sketch, as an expansion to the question "- What is new in
> your approach and why do you think it will be successful?"
>
> O
Short answer is it isn't necessary.
Long answer is that you aren't just changing from 0.8 to 0.10, you're
changing from the receiver-based implementation to the direct stream.
Read these:
https://github.com/koeninger/kafka-exactly-once
http://spark.apache.org/docs/latest/streaming-kafka-0-8-integrat
According to
http://spark.apache.org/improvement-proposals.html
the shepherd should be a PMC member, not necessarily the person who
proposed the SPIP
On Tue, Jul 17, 2018 at 9:13 AM, Wenchen Fan wrote:
> I don't know an official answer, but conventionally people who propose the
> SPIP would cal
Sounds good, I'd like to add SPARK-24067 today assuming there's no objections
On Thu, May 10, 2018 at 1:22 PM, Henry Robinson wrote:
> +1, I'd like to get a release out with SPARK-23852 fixed. The Parquet
> community are about to release 1.8.3 - the voting period closes tomorrow -
> and I've test
https://issues.apache.org/jira/browse/SPARK-24067
is asking to backport a change to the 2.3 branch.
My questions
- In general are there any concerns about what qualifies for backporting?
This adds a configuration variable but shouldn't change default behavior.
- Is a separate jira + pr actuall
Congrats!
On Mon, Apr 2, 2018 at 12:28 AM, Wenchen Fan wrote:
> Hi all,
>
> The Spark PMC recently added Zhenhua Wang as a committer on the project.
> Zhenhua is the major contributor of the CBO project, and has been
> contributing across several areas of Spark for a while, focusing especially
>
er past work:
>
> - Anirudh Ramanathan (contributor to Kubernetes support)
> - Bryan Cutler (contributor to PySpark and Arrow support)
> - Cody Koeninger (contributor to streaming and Kafka support)
> - Erik Erlandson (contributor to Kubernetes support)
> - Matt Cheah (contributor to Kube
Was there any answer to my question around the effect of changes to
the sink api regarding access to underlying offsets?
On Wed, Nov 1, 2017 at 11:32 AM, Reynold Xin wrote:
> Most of those should be answered by the attached design sketch in the JIRA
> ticket.
>
> On Wed, Nov 1, 2017 at 5:29 PM De
23 AM, Suprith T Jain wrote:
> Yes I tried that. But it's not that effective.
>
> In fact kafka SimpleConsumer tries to reconnect in case of socket error
> (sendRequest method). So it'll always be twice the timeout for every window
> and for every node that is down.
>
>
Have you tried adjusting the timeout?
On Mon, Oct 16, 2017 at 8:08 AM, Suprith T Jain wrote:
> Hi guys,
>
> I have a 3 node cluster and i am running a spark streaming job. consider the
> below example
>
> /*spark-submit* --master yarn-cluster --class
> com.huawei.bigdata.spark.examples.FemaleInfo
https://issues-test.apache.org/jira/browse/SPARK-18258
On Mon, Sep 11, 2017 at 7:15 AM, Dmitry Naumenko wrote:
> Hi all,
>
> It started as a discussion in
> https://stackoverflow.com/questions/46153105/how-to-get-kafka-offsets-with-spark-structured-streaming-api.
>
> So the problem that there is
Is the Kafka 0.10 integration as stable as it is going to be, and worth
> marking as such for 2.3.0?
>
>
> On Tue, Sep 5, 2017 at 4:12 PM Cody Koeninger wrote:
>>
>> +1 to going ahead and giving a deprecation warning now
>>
>> On Tue, Sep 5, 2017 at 6:39 AM, Sean Ow
+1 to going ahead and giving a deprecation warning now
On Tue, Sep 5, 2017 at 6:39 AM, Sean Owen wrote:
> On the road to Scala 2.12, we'll need to make Kafka 0.8 support optional in
> the build, because it is not available for Scala 2.12.
>
> https://github.com/apache/spark/pull/19134 adds that
Here's the jira for upgrading to a 0.10.x point release, which is
effectively the discussion of upgrading to 0.11 now
https://issues.apache.org/jira/browse/SPARK-18057
On Tue, Sep 5, 2017 at 1:27 AM, matus.cimerman wrote:
> Hi guys,
>
> is there any plans to support Kafka 0.11 integration for Sp
Just wanted to point out that because the jira isn't labeled SPIP, it
won't have shown up linked from
http://spark.apache.org/improvement-proposals.html
On Mon, Aug 28, 2017 at 2:20 PM, Wenchen Fan wrote:
> Hi all,
>
> It has been almost 2 weeks since I proposed the data source V2 for
> discussi
Can you explain in more detail what you mean by "distribute Kafka
topics among different instances of same consumer group"?
If you're trying to run multiple streams using the same consumer
group, it's already documented that you shouldn't do that.
On Thu, Jun 8, 2017 at 12:43 AM, Rastogi, Pankaj
017 7:26 p.m., "Michael Armbrust" wrote:
>>
>> He's just suggesting that since the DataStreamWriter start() method can
>> fill in an option named "path", we should make that a synonym for "topic".
>> Then you could do something like.
>>
>>
I'm confused about what you're suggesting. Are you saying that a
Kafka sink should take a filesystem path as an option?
On Mon, May 1, 2017 at 8:52 AM, Jacek Laskowski wrote:
> Hi,
>
> I've just found out that KafkaSourceProvider supports topic option
> that sets the Kafka topic to save a DataFr
There are existing tickets on the issues around kafka versions, e.g.
https://issues.apache.org/jira/browse/SPARK-18057 that haven't gotten
any committer weigh-in on direction.
On Thu, Mar 9, 2017 at 12:52 PM, Oscar Batori wrote:
> Guys,
>
> To change the subject from meta-voting...
>
> We are doi
pen ticket with the SPIP label so it should show up
On Fri, Mar 10, 2017 at 11:19 AM, Reynold Xin wrote:
> We can just start using spip label and link to it.
>
>
>
> On Fri, Mar 10, 2017 at 9:18 AM, Cody Koeninger wrote:
>>
>> So to be clear, if I translate that go
ins
> can make a new issue type unfortunately. We may just have to mention a
> convention involving title and label or something.
>
> On Fri, Mar 10, 2017 at 4:52 PM Cody Koeninger wrote:
>>
>> I think it ought to be its own page, linked from the more / community
>> menu dr
I think it ought to be its own page, linked from the more / community
menu dropdowns.
We also need the jira tag, and for the page to clearly link to filters
that show proposed / completed SPIPs
On Fri, Mar 10, 2017 at 3:39 AM, Sean Owen wrote:
> Alrighty, if nobody is objecting, and nobody calls
;s a code/doc
> change we can just review and merge as usual.
>
> On Tue, Mar 7, 2017 at 3:15 PM Cody Koeninger wrote:
>>
>> Another week, another ping. Anyone on the PMC willing to call a vote on
>> this?
-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> rb
>
> On Fri, Feb 24, 2017 at 8:28 PM, Joseph Bradley
> wrote:
>
>> The current draft LGTM. I agree some of the various concerns may need to
>> be addressed in the future, depending on how SPIPs progress in practice.
>> If others agree, let's put it t
oc looks good to me.
>>>
>>> Ryan, the role of the shepherd is to make sure that someone
>>> knowledgeable with Spark processes is involved: this person can advise
>>> on technical and procedural considerations for people outside the
>>> community. Also, if
ng what SPIP implies. It's just a process
> document.
>
> Still, a fine step IMHO.
>
> On Thu, Feb 16, 2017 at 4:22 PM Reynold Xin wrote:
>>
>> Updated. Any feedback from other community members?
>>
>>
>> On Wed, Feb 15, 2017 at 2:53 AM, Cody Koe
ure list was always above 100. Sometimes, the
>> customers are feeling frustrated when we are unable to deliver them on time
>> due to the resource limits and others. Even if they paid us billions, we
>> still need to do it phase by phase or sometimes they have to accept the
>>
>> up with a distracting long tail of half-hearted proposals.
>>
>> These rules are meant to be flexible, but the current document should be
>> clear about who is in charge of a SPIP, and the state it is currently in.
>>
>> We have had long discussions over some very imp
Congrats, glad to hear it
On Jan 24, 2017 12:47 PM, "Shixiong(Ryan) Zhu"
wrote:
> Congrats Burak & Holden!
>
> On Tue, Jan 24, 2017 at 10:39 AM, Joseph Bradley
> wrote:
>
>> Congratulations Burak & Holden!
>>
>> On Tue, Jan 24, 2017 at 10:33 AM, Dongjoon Hyun
>> wrote:
>>
>>> Great! Congratula
Totally agree with most of what Sean said, just wanted to give an
alternate take on the "maintainers" thing
On Tue, Jan 24, 2017 at 10:23 AM, Sean Owen wrote:
> There is no such list because there's no formal notion of ownership or
> access to subsets of the project. Tracking an informal notion w
mentioned above + a new one
>> w.r.t. Reynold's draft
>> <https://docs.google.com/document/d/1-Zdi_W-wtuxS9hTK0P9qb2x-nRanvXmnZ7SUi4qMljg/edit#>
>> :
>> * Reinstate the "Where" section with links to current and past SIPs
>> * Add field for stating
requirement for three +1 votes. Why
> would we not want at least three committers to think something is a good
> idea before adopting the proposal?
>
> rb
>
> On Tue, Nov 8, 2016 at 8:13 AM, Cody Koeninger wrote:
>>
>> So there are some minor things (the Where sec
Agree that frequent topic deletion is not a very Kafka-esque thing to do
On Fri, Dec 9, 2016 at 12:09 PM, Shixiong(Ryan) Zhu
wrote:
> Sean, "stress test for failOnDataLoss=false" is because Kafka consumer may
> be thrown NPE when a topic is deleted. I added some logic to retry on such
> failure,
If you want finer-grained max rate setting, SPARK-17510 got merged a
while ago. There's also SPARK-18580 which might help address the
issue of starting backpressure rate for the first batch.
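For reference, the knobs in question look roughly like this in spark-defaults / --conf form (the values are illustrative only):

```
# Cap each Kafka partition at N records per second (direct stream)
spark.streaming.kafka.maxRatePerPartition   1000
# Let the backpressure estimator adjust the rate from batch to batch
spark.streaming.backpressure.enabled        true
# Rate to use for the first batch, before the estimator has any data
spark.streaming.backpressure.initialRate    500
```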
On Mon, Dec 5, 2016 at 4:18 PM, Liren Ding wrote:
> Hey all,
>
> Does backpressure actually work on spark
s that can generate
> RDDs from new data by running a service/thread only on the driver node (that
> is, without running a receiver on worker nodes)
>
> Thanks and regards,
> Aakash Pradeep
>
>
> On Tue, Nov 15, 2016 at 2:55 PM, Cody Koeninger wrote:
>>
>> It'
It'd probably be worth no longer marking the 0.8 interface as
experimental. I don't think it's likely to be subject to active
development at this point.
You can use the 0.8 artifact to consume from a 0.9 broker
Where are you reading documentation indicating that the direct stream
only runs on th
; I think they are open to others helping, in fact, more than one person has
> worked on the JIRA so far. And, it's been crawling really slowly and that's
> preventing adoption of Spark's new connector in secure Kafka environments.
>
> On Tue, Nov 8, 2016 at 7:59 PM, Cod
Have you asked the assignee on the Kafka jira whether they'd be
willing to accept help on it?
On Tue, Nov 8, 2016 at 5:26 PM, Mark Grover wrote:
> Hi all,
> We currently have a new direct stream connector, thanks to work by Cody and
> others on SPARK-12177.
>
> However, that can't be used in secu
t.
>>
>>
>> On Monday, November 7, 2016, Cody Koeninger wrote:
>>>
>>> Thanks for picking up on this.
>>>
>>> Maybe I fail at google docs, but I can't see any edits on the document
>>> you linked.
>>>
>>> Regarding la
anzin
>> wrote:
>>>
>>> The proposal looks OK to me. I assume, even though it's not explicitly
>>> called, that voting would happen by e-mail? A template for the
>>> proposal document (instead of just a bullet nice) would also be nice,
>>>
SPARK-17510
https://github.com/apache/spark/pull/15132
It's for allowing tweaking of rate limiting on a per-partition basis
I answered the duplicate post on the user mailing list, I'd say keep
the discussion there.
On Fri, Nov 4, 2016 at 12:14 PM, vonnagy wrote:
> Nitin,
>
> I am getting the similar issues using Spark 2.0.1 and Kafka 0.10. I have to
> jobs, one that uses a Kafka stream and one that uses just the Kafka
So concrete things people could do
- users could tag subject lines appropriately to the component they're
asking about
- contributors could monitor user@ for tags relating to components
they've worked on.
I'd be surprised if my miss rate for any mailing list questions
well-labeled as Kafka was hi
Makes sense to me.
I do wonder if e.g.
[SPARK-12345][STRUCTUREDSTREAMING][KAFKA]
is going to leave any room in the GitHub PR form for actual title content?
On Mon, Oct 31, 2016 at 1:37 PM, Michael Armbrust
wrote:
> I'm planning to do a little maintenance on JIRA to hopefully improve the
> visi
y mail was just to show some
> aspects from my side, so from theside of developer and person who is trying
> to help others with Spark (via StackOverflow or other ways)
>
>
> Pozdrawiam / Best regards,
>
> Tomasz
>
>
>
> Od: Cody Koeninger
>
I think only supporting 1 version of Scala at any given time is not
sufficient; 2 probably is ok.
I.e. don't drop 2.10 before 2.12 is out + supported
On Tue, Oct 25, 2016 at 10:56 AM, Sean Owen wrote:
> The general forces are that new versions of things to support emerge, and
> are valuable to s
think that makes sense, I can start a
ticket.
On Thu, Oct 20, 2016 at 1:16 PM, Reynold Xin wrote:
> Seems like a good new API to add?
>
>
> On Thu, Oct 20, 2016 at 11:14 AM, Cody Koeninger wrote:
>>
>> Access to the partition ID is necessary for basically every single one
Access to the partition ID is necessary for basically every single one
of my jobs, and there isn't a foreachPartitionWithIndex equivalent.
You can kind of work around it with an empty foreach after the map, but
it's really awkward to explain to people.
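The workaround reads something like this (a sketch only; `foreachPartitionWithIndex` here is the helper you'd have to write yourself, not an RDD method):

```scala
import org.apache.spark.rdd.RDD

// There's no foreachPartitionWithIndex on RDD, so run the side effect inside
// mapPartitionsWithIndex and tack on an empty foreach purely to trigger the job.
def foreachPartitionWithIndex[T](rdd: RDD[T])(f: (Int, Iterator[T]) => Unit): Unit =
  rdd.mapPartitionsWithIndex { (partitionId, iter) =>
    f(partitionId, iter)
    Iterator.empty
  }.foreach(_ => ())
```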
On Thu, Oct 20, 2016 at 12:52 PM, Reynold Xin wr
n latency ?
>>
> I think that the fact that they serve as an output trigger is a problem,
> but Structured Streaming seems to resolve this now.
>
>>
>> Thanks
>> Shivaram
>>
>> On Wed, Oct 19, 2016 at 1:29 PM, Michael Armbrust
>> wrote:
>>
Is anyone seriously thinking about alternatives to microbatches?
On Wed, Oct 19, 2016 at 2:45 PM, Michael Armbrust
wrote:
> Anything that is actively being designed should be in JIRA, and it seems
> like you found most of it. In general, release windows can be found on the
> wiki.
>
> 2.1 has a
+1 to putting docs in one clear place.
On Oct 18, 2016 6:40 AM, "Sean Owen" wrote:
> I'm OK with that. The upside to the wiki is that it can be edited directly
> outside of a release cycle. However, in practice I find that the wiki is
> rarely changed. To me it also serves as a place for informa
SPARK-17841 three line bugfix that has a week old PR
SPARK-17812 being able to specify starting offsets is a must have for
a Kafka mvp in my opinion, already has a PR
SPARK-17813 I can put in a PR for this tonight if it'll be considered
On Mon, Oct 17, 2016 at 12:28 AM, Reynold Xin wrote:
> Si
for SIP. However I think that Spark should
>> have real-time streaming support. Currently I see many posts/comments
>> that "Spark has too big latency". Spark Streaming is doing very good
>> jobs with micro-batches, however I think it is possible to add also more
>> r
I've always been confused as to why it would ever be a good idea to
put any streaming query system on the critical path for synchronous <
100msec requests. It seems to make a lot more sense to have a
streaming system do asynch updates of a store that has better latency
and quality of service char
we have run into some trouble in the past
> with some inside the ASF but essentially outside the Spark community who
> didn't like the way we were doing things.
>
> On Mon, Oct 10, 2016 at 3:53 PM, Cody Koeninger wrote:
>>
>> Apache documents say lots of confusing stuf
it confusing and can reduce contributions.
>> Although, as engineers, we believe that anything can be solved using
>> mechanical rules, in practice software development is a social process that
>> ultimately requires humans to tackle things on a case-by-case basis.
>>
>
nd I wouldn't want to move forward if up to half of the
> community thinks it's an untenable idea.
>
> rb
>
> On Mon, Oct 10, 2016 at 12:07 PM, Cody Koeninger wrote:
>>
>> I think this is closer to a procedural issue than a code modification
>> issue, henc
ess? I
>> think restricting who can submit proposals would only undermine them by
>> pushing contributors out. Maybe I'm missing something here?
>>
>> rb
>>
>>
>>
>> On Mon, Oct 10, 2016 at 7:41 AM, Cody Koeninger
>> wrote:
>>>
>>
submit proposals would only undermine them by
> pushing contributors out. Maybe I'm missing something here?
>
> rb
>
>
>
> On Mon, Oct 10, 2016 at 7:41 AM, Cody Koeninger wrote:
>>
>> Yes, users suggesting SIPs is a good thing and is explicitly called
ave a large effect on the goal, we should
> have it discussed when discussing the goals. In addition, while it is often
> easy to throw out completely infeasible goals, it is often much harder to
> figure out that the goals are unfeasible without fine tuning.
>
>
>
>
>
>
signer of software, I always want to
> give feedback on APIs, so I'd really like a culture of having those early.
> People don't argue about prettiness when they discuss APIs, they argue about
> the core concepts to expose in order to meet various goals, and then they
asible
>> right now? If it's infeasible, that will be discovered later during design
>> and implementation. Same thing with rejected strategies -- listing some of
>> those is definitely useful sometimes, but if you make this a *required*
>> section, people are just going
s infeasible, that will be discovered later
> during design and implementation. Same thing with rejected strategies --
> listing some of those is definitely useful sometimes, but if you make this
> a *required* section, people are just going to fill it in with bogus stuff
> (I've see
Regarding name, if the SIP overlap is a concern, we can pick a different name.
My tongue in cheek suggestion would be
Spark Lightweight Improvement process (SPARKLI)
On Sun, Oct 9, 2016 at 4:14 PM, Cody Koeninger wrote:
> So to focus the discussion on the specific strategy I'm su
step for user feedback earlier? Or are you just trying to make
> design docs for key features more visible (and their approval more formal)?
>
> BTW note that in either case, I'd like to have a template for design docs
> too, which should also include goals. I think that would
want the SIPs to be
>>> PRDs for getting some quick feedback on the goals of a feature before it is
>>> designed, or something more like full-fledged design docs (just a more
>>> visible design doc for bigger changes). I looked at Kafka's KIPs, and they
>>> actu
entails, and then we can discuss this the specific proposal as well.
>
>
> On Fri, Oct 7, 2016 at 2:29 PM, Cody Koeninger wrote:
>
>> Yeah, in case it wasn't clear, I was talking about SIPs for major
>> user-facing or cross-cutting changes, not minor feat
That's awesome Sean, very clear.
One minor thing: noncommitters can't change the assignee field, as far as I know.
On Oct 9, 2016 3:40 AM, "Sean Owen" wrote:
I added a variant on this text to https://cwiki.apache.org/
confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-
ContributingtoJ
It's not about technical design disagreement as to matters of taste,
it's about familiarity with the domain. To make an analogy, it's as
if a committer in MLlib was firmly intent on, I dunno, treating a
collection of categorical variables as if it were an ordered range of
continuous variables. It
at 2:19 PM, Reynold Xin wrote:
> I think so (at least I think it is socially acceptable). Of course, use good
> judgement here :)
>
>
>
> On Sat, Oct 8, 2016 at 12:06 PM, Cody Koeninger wrote:
>>
>> So to be clear, can I go clean up the Kafka cruft?
>>
>>
So to be clear, can I go clean up the Kafka cruft?
On Sat, Oct 8, 2016 at 1:41 PM, Reynold Xin wrote:
>
> On Sat, Oct 8, 2016 at 2:09 AM, Sean Owen wrote:
>>
>>
>> - Resolve as Fixed if there's a change you can point to that resolved the
>> issue
>> - If the issue is a proper subset of another i
tributions to the attention of committers.
>
> I dunno if people think this is perhaps too complex, but at our scale I
> feel we need some kind of loose but automated system for funneling
> contributions through some kind of lifecycle. The status quo is just not
> that good (e.g
That makes sense, thanks.
One thing I've never been clear on is who should be allowed to resolve
Jiras. Can I go clean up the backlog of Kafka Jiras that weren't created
by me?
If there's an informal policy here, can we update the wiki to reflect it?
Maybe it's there already, but I didn't see it
neling contributions
> through some kind of lifecycle. The status quo is just not that good (e.g.
> 474 open PRs against Spark as of this moment).
>
> Nick
>
>
> On Fri, Oct 7, 2016 at 4:48 PM Cody Koeninger wrote:
>>
>> Matei asked:
>>
>>
>> >
s, missing features, slow reviews
> which is understandable to some extent... it is not only about Spark but
> things can be improved for sure for this project in particular as already
> stated.
>
> On Fri, Oct 7, 2016 at 11:14 PM, Cody Koeninger
> wrote:
>
>> +1 to addin
The main thing is picking up new partitions. You can't do that
without reimplementing portions of the consumer rebalance. The
low-level consumer is really low level, and the old high-level
consumer is basically broken (it might have been fixed by the time
they abandoned it, I dunno)
On Fri, Oct
So concrete problems / potential solutions:
- Technical discussion needs to be public, or you don't hear use cases
and alternative viewpoints.
Yet email communication is low-bandwidth and hard to read people's
emotions, so committers who are colocated talk and decide things.
A possible alternativ
Matei asked:
> I agree about empowering people interested here to contribute, but I'm
> wondering, do you think there are technical things that people don't want to
> work on, or is it a matter of what there's been time to do?
It's a matter of mismanagement and miscommunication.
The structur
Without a hell of a lot more work, Assign would be the only strategy usable.
On Fri, Oct 7, 2016 at 3:25 PM, Michael Armbrust wrote:
>> The implementation is totally and completely different however, in ways
>> that leak to the end user.
>
>
> Can you elaborate? Especially in the context of the i
0.10 consumers won't work on an earlier broker.
Earlier consumers will (should?) work on a 0.10 broker.
The main things earlier consumers lack from a user perspective is
support for SSL, and pre-fetching messages. The implementation is
totally and completely different however, in ways that leak
+1 to adding an SIP label and linking it from the website. I think it needs
- template that focuses it towards soliciting user goals / non goals
- clear resolution as to which strategy was chosen to pursue. I'd
recommend a vote.
Matei asked me to clarify what I meant by changing interfaces, I t
Sean, that was very eloquently put, and I 100% agree. If I ever meet
you in person, I'll buy you multiple rounds of beverages of your
choice ;)
This is probably reiterating some of what you said in a less clear
manner, but I'll throw more of my 2 cents in.
- Design.
Yes, design by committee doesn
I love Spark. 3 or 4 years ago it was the first distributed computing
environment that felt usable, and the community was welcoming.
But I just got back from the Reactive Summit, and this is what I observed:
- Industry leaders on stage making fun of Spark's streaming model
- Open source project
Totally agree that specifying the schema manually should be the
baseline. LGTM, thanks for working on it. Seems like it looks good
to others too judging by the comment on the PR that it's getting
merged to master :)
On Thu, Sep 29, 2016 at 2:13 PM, Michael Armbrust
wrote:
>> Will this be able t
Will this be able to handle projection pushdown if a given job doesn't
utilize all the columns in the schema? Or should people have a
per-job schema?
On Wed, Sep 28, 2016 at 2:17 PM, Michael Armbrust
wrote:
> Burak, you can configure what happens with corrupt records for the
> datasource using t
Regarding documentation debt, is there a reason not to deploy
documentation updates more frequently than releases? I recall this
used to be the case.
On Wed, Sep 28, 2016 at 3:35 PM, Joseph Bradley wrote:
> +1 for 4 months. With QA taking about a month, that's very reasonable.
>
> My main ask (