jerqi commented on issue #79:
URL:
https://github.com/apache/incubator-uniffle/issues/79#issuecomment-1197624531
Do you want to raise a pr?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
jerqi commented on issue #76:
URL:
https://github.com/apache/incubator-uniffle/issues/76#issuecomment-1197623471
Could you share your solution? We can discuss first.
colinmjj commented on issue #95:
URL:
https://github.com/apache/incubator-uniffle/issues/95#issuecomment-1197620579
> > @zuston The current implementation limits the number of connections,
because we don't want too many connections established between the client and
shuffle server. We also plan to i
jerqi commented on issue #95:
URL:
https://github.com/apache/incubator-uniffle/issues/95#issuecomment-1197619293
> @zuston The current implementation limits the number of connections, because
we don't want too many connections established between the client and shuffle
server. We also plan to improv
zuston commented on issue #95:
URL:
https://github.com/apache/incubator-uniffle/issues/95#issuecomment-1197618109
Glad to hear this. The flame graph shows that, due to the extra memory copy,
too much time is spent on the shuffle server side.
If using Netty to directly manipulate shuffle data by
colinmjj commented on issue #95:
URL:
https://github.com/apache/incubator-uniffle/issues/95#issuecomment-1197616175
@zuston The current implementation limits the number of connections, because
we don't want too many connections established between the client and shuffle
server. We also plan to imp
colinmjj commented on issue #81:
URL:
https://github.com/apache/incubator-uniffle/issues/81#issuecomment-1197613441
@zuston The benchmark in the blog is based on Spark 2.4.6.
If ESS has no random disk IO problem, Uniffle is expected to have
**poorer performance** than ESS
zuston opened a new issue, #95:
URL: https://github.com/apache/incubator-uniffle/issues/95
### Motivation
Currently the executor uses only a single TCP connection to the specified
shuffle server, so when multiple tasks run concurrently, they share
this channel. Maybe it will
zuston commented on issue #92:
URL:
https://github.com/apache/incubator-uniffle/issues/92#issuecomment-1197607466
Got it. If we have a better design for this, I think it will achieve better
performance.
zuston commented on issue #81:
URL:
https://github.com/apache/incubator-uniffle/issues/81#issuecomment-1197606473
Attached is the Google doc with the test results from our internal cluster:
https://docs.google.com/document/d/1nmHMBEaa4lHfgQkdlYokTtXt12F5vRJiQ2MCmTQbW6k/edit#heading=h.b1udpb9l28w7
jerqi commented on issue #77:
URL:
https://github.com/apache/incubator-uniffle/issues/77#issuecomment-1197605715
> K8S Deployment can't solve this issue.
@xianjingfeng I think you can continue this.
colinmjj commented on issue #77:
URL:
https://github.com/apache/incubator-uniffle/issues/77#issuecomment-1197603444
> > I'm also curious why we need to modify start script?
>
> start script will process existence
Yes, this is a limitation of the current implementation
colinmjj commented on issue #81:
URL:
https://github.com/apache/incubator-uniffle/issues/81#issuecomment-1197597392
@zuston You can refer to this
[blog](https://cloud.tencent.com/developer/article/1943179) for the related
benchmark.
colinmjj commented on issue #92:
URL:
https://github.com/apache/incubator-uniffle/issues/92#issuecomment-1197595382
The performance problem of `getBlockIdsByPartitionId` is a known issue.
With the current design, blockId should be stored in the shuffle server to support
features like block filte
jerqi closed issue #90: [Performance Optimization] Improve the speed of writing
index file in shuffle server
URL: https://github.com/apache/incubator-uniffle/issues/90
jerqi commented on issue #90:
URL:
https://github.com/apache/incubator-uniffle/issues/90#issuecomment-1197592448
solved by #91
zuston commented on issue #92:
URL:
https://github.com/apache/incubator-uniffle/issues/92#issuecomment-1197577668
@colinmjj
jerqi commented on issue #78:
URL:
https://github.com/apache/incubator-uniffle/issues/78#issuecomment-1196899262
cc @colinmjj, do you remember our flaky metric test? I guess that it's
caused by this issue.
jerqi commented on issue #80:
URL:
https://github.com/apache/incubator-uniffle/issues/80#issuecomment-119698
If we want to add an interface to control the shuffle server's behavior, we
should have a complete design, and we think we need detailed discussions. We
once had a similar idea in
jerqi commented on issue #80:
URL:
https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1196808782
Could you write a design doc (using Google Docs)? This issue is a
little complex.
jerqi commented on issue #78:
URL:
https://github.com/apache/incubator-uniffle/issues/78#issuecomment-1196688153
> No logs, we just found this phenomenon. Maybe
`org.apache.uniffle.common.rpc.MonitoringServerCall#close` is sometimes
not called. I try to call `decCounter` in
`MonitoringServer
jerqi commented on issue #77:
URL:
https://github.com/apache/incubator-uniffle/issues/77#issuecomment-1196682268
K8S Deployment can't solve this issue.
xianjingfeng commented on issue #79:
URL:
https://github.com/apache/incubator-uniffle/issues/79#issuecomment-1196681674
Yes, by using nohup
xianjingfeng commented on issue #78:
URL:
https://github.com/apache/incubator-uniffle/issues/78#issuecomment-1196679903
No logs, we just found this phenomenon. Maybe
`org.apache.uniffle.common.rpc.MonitoringServerCall#close` is sometimes
not called. I try to call `decCounter` in
`Monitorin
xianjingfeng commented on issue #77:
URL:
https://github.com/apache/incubator-uniffle/issues/77#issuecomment-119659
If our plan is to deploy on K8s, should this issue be closed?
xianjingfeng commented on issue #77:
URL:
https://github.com/apache/incubator-uniffle/issues/77#issuecomment-1196664044
> I'm also curious why we need to modify start script?
start script will process existence
xianjingfeng commented on issue #76:
URL:
https://github.com/apache/incubator-uniffle/issues/76#issuecomment-1196662502
Yes, it is being tested in our production environment. I will watch it for a
while. If it's OK, I will create a PR
xianjingfeng commented on issue #80:
URL:
https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1196656326
> I understand that you need a `rolling upgrade` feature. In our plan, we
want to accomplish this feature by k8s operator. For the standalone mode, we
don't have the plan
zuston opened a new issue, #92:
URL: https://github.com/apache/incubator-uniffle/issues/92
### Background
I found that when getting the shuffle result, the flame graph shows that the
`getBlockIdsByPartitionId` method takes too much time.
![reliao_img_1658922962790](https://user-images.githu
colinmjj commented on issue #90:
URL:
https://github.com/apache/incubator-uniffle/issues/90#issuecomment-1196587758
@zuston good catch
zuston opened a new issue, #90:
URL: https://github.com/apache/incubator-uniffle/issues/90
### Motivation
When I tested Uniffle performance, I found a huge performance drop due to the
slow writing of the index file. Flame graph attached:
![reliao_img_1658917352873](https://user-ima
smallzhongfeng closed issue #89: [Improvement] Add a load policy based on disk
performance
URL: https://github.com/apache/incubator-uniffle/issues/89
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196532415
OK.
jerqi commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196522252
> Maybe you are right, but I think we should open it up so that we can
verify this situation in more production environments.
You can turn it on when you deploy the shuffl
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196481405
Maybe you are right, but I think we should open it up so that we can verify
this situation in more production environments.
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196480313
But I think we should open it up so that we can verify this situation in
more production environments.
jerqi commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196467989
I think we should verify the `HealthCheck` function in a production environment
first before we turn it on. But there are few broken disks in our production
environment. I prefer u
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196452995
Do we need to turn this parameter HealthCheck on by default? This allows for
better screening of healthy machines. @colinmjj @jerqi
colinmjj commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196418301
> @colinmjj I think you are right, but is it possible that memory is
allocated normally, but disk IO has problems?
I think you're worried about the shuffle server with ab
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196406719
@colinmjj I think you are right, but is it possible that memory is allocated
normally, but disk IO has problems?
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196405275
I mean the shuffleServer's property isHealthy returns true by default, but
not the HealthCheck's default value.
colinmjj commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196400738
@smallzhongfeng The workload of the Shuffle Server depends on a lot of things,
e.g., memory, disk IO, network IO, etc. To simplify the assignment strategy,
memory is chosen as the m
jerqi commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196378640
> 1. I think this scheme can only support MEMORY LOCALFILE.
> 2. Since this HealthCheck collects the information of the local disk, we
can use this feature. This health check
smallzhongfeng commented on issue #89:
URL:
https://github.com/apache/incubator-uniffle/issues/89#issuecomment-1196369819
1. I think this scheme can only support MEMORY LOCALFILE.
2. Since this HealthCheck collects the information of the local disk, we can
use this feature. This health c