Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
You commented on my code, not on the idea. My code was hacked together
quickly, it can be cleaned up a lot. Your comments don't disprove that
separating the refactoring of the listener bus hierarchy
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@vanzin I pushed some comments on your code. I think that trying to keep
the exact same class hierarchy leads to a very complex code, with many
drawbacks.
---
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
@bOOm-X
I pushed some code to my repo:
https://github.com/vanzin/spark/tree/SPARK-18838
Which is an attempt to do things the way I've been trying to explain. It
tries to keep
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
I really dislike `WithListenerBus` - both as a name and as a concept.
There's already a `ListenerBus` trait; if it's not enough or is broken in some
way, it should be fixed, instead of being patched
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@vanzin
`GroupOfListener` is just a set of listeners, which handle all the metric
stuff. it allows to decouple this metric aspect from the `ListernerBus` and the
`LiveListenerBus`. And It
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@cloud-fan
In my opinion, having the queues indexed by a string and reflecting that in
the API is a bit too error prone in this case. What you want is to be in an
isolated queue. With this kind
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
Agree that it still seems like there's too many moving parts here. I don't
see a whole lot of difference between `BusQueue` and `GroupOfListener` (which
is a weird name, btw). A queue can have a
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18253
I think `BusQueue` is a good abstraction, but we can still simplify other
parts. My proposal:
```
// Do we need a better name like ListenesGroup? it's very similar to the
previous
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@cloud-fan I will update the PR description with some details on the
implementation
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18253
The PR description is good for explaining the new behavior, but can you say
more about the implementation?
IMO, we just need to duplicate the event queue for each important listener
like
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@vanzin I simplify a lot the code. There are now only one implementation
for the queue and for the group of listeners. I removed the extra trait in the
listener hierarchy too.
Can you take a
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
retest this please
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so,
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Can one of the admins verify this patch?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@cloud-fan
There are now multiple queues. Many threads send events to a dispatcher
which copy the event in each queue. One thread per queue poll the events from
its dedicated queue and call its
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@vanzin I put an ArrayBlockingQueue as you wanted
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79524/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79524 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79524/testReport)**
for PR 18253 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79524 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79524/testReport)**
for PR 18253 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79517/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79517 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79517/testReport)**
for PR 18253 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79517 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79517/testReport)**
for PR 18253 at commit
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
I was out for the last couple of weeks. But, taking a quick look, I really
would prefer for you to stop trying to implement your own blocking queue
without numbers that justify the need for it.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18253
What's the processing model now? Previously it's very simple, we have an
event queue, and many threads send events to this queue, and one thread poll
events from this and call listeners. What is
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79296/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79296 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79296/testReport)**
for PR 18253 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79296 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79296/testReport)**
for PR 18253 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79271 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79271/testReport)**
for PR 18253 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79271/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #79271 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79271/testReport)**
for PR 18253 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18253
retest this please
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
retest this please
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so,
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@cloud-fan Yes it is !
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18253
is this a successor of https://github.com/apache/spark/pull/16291 ?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@vanzin : Any news ?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so,
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
`PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND
11312 a.prang 20 0 17.661g 1.699g 66924 S 8.3 0.7 1:53.41 java`
Ok it is good now.
Do you have an
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78624/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #78624 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78624/testReport)**
for PR 18253 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #78624 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78624/testReport)**
for PR 18253 at commit
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
please test
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78569/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #78569 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78569/testReport)**
for PR 18253 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #78569 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78569/testReport)**
for PR 18253 at commit
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
And idle spark-shell should be using 0% of the CPU.
```
PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND
1304
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
With a `LockSupport.parkNanos(1000)` instead I get:
```
PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND
43448 a.prang 20 0 17.663g 2.378g 8 S 35.2
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
To illustrate what I'm trying to say, here's what `top` says for an *idle*
spark-shell with your code:
```
PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
> I change the Thread.sleep to a Thread.yield to be less agressive for the
thread unscheduling
How does that help? Now you have a thread that never sleeps. It gives up
its scheduler slot
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78412/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18253
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #78412 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78412/testReport)**
for PR 18253 at commit
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
> Well, you could use an ArrayBlockingQueue. Then no extra object is
allocated either.
Yes I agree. But you get the synchronization too. I am still agree that it
should not have a big
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18253
**[Test build #78412 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78412/testReport)**
for PR 18253 at commit
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/18253
> This implementation had 2 advantages... it use much less synchronization.
Yes, but have you quantified how much you win with that? If the blocking
queue approach has enough throughput for
Github user bOOm-X commented on the issue:
https://github.com/apache/spark/pull/18253
@vanzin Ok it is ready now.
> why SynchronousListenerBus? It doesn't seem to add anything interesting
over the existing interface.
I removed it.
> the whole "group of
59 matches
Mail list logo