[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534889#comment-17534889 ] Gabor Somogyi commented on SPARK-25355: --- I've had a deeper look and HADOOP_TOKEN_FILE_LOCATION is a generic issue together w/ --proxy-user. Fixing only the K8S side is a good direction but not enough from my perspective. > Support --proxy-user for Spark on K8s > - > > Key: SPARK-25355 > URL: https://issues.apache.org/jira/browse/SPARK-25355 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes, Spark Core >Affects Versions: 3.1.0 >Reporter: Stavros Kontopoulos >Assignee: Pedro Rossi >Priority: Major > Fix For: 3.1.0 > > Attachments: client.log, driver.log, screenshot-1.png, > with_proxy_extradebugLogs.log > > > SPARK-23257 adds kerberized hdfs support for Spark on K8s. A major addition > needed is the support for proxy user. A proxy user is impersonated by a > superuser who executes operations on behalf of the proxy user. More on this: > [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html] > [https://github.com/spark-notebook/spark-notebook/blob/master/docs/proxyuser_impersonation.md] > This has been implemented for Yarn upstream and Spark on Mesos here: > [https://github.com/mesosphere/spark/pull/26] > [~ifilonenko] creating this issue according to our discussion. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532191#comment-17532191 ] Gabor Somogyi commented on SPARK-25355: --- [~unamesk15] From Spark's point of view, I think the only option for now is to obtain tokens externally and use the "spark.kubernetes.kerberos.tokenSecret..." configs.
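The external-token workaround above could be sketched roughly as follows. This is a hedged sketch, not a verified recipe: the secret name `dt-secret`, the item key `hadoop.token`, the principal, and all paths are placeholder assumptions, and the `spark.kubernetes.kerberos.tokenSecret.*` keys should be double checked against the Spark on K8s security docs for the Spark version in use.

```shell
# Sketch only: obtain a delegation token outside of Spark and hand it to the
# driver pod via a Kubernetes secret. All names and paths are placeholders.

# 1. Authenticate as the superuser and fetch an HDFS delegation token to a file.
kinit superuser@EXAMPLE.COM
hdfs fetchdt --renewer superuser /tmp/hadoop.token

# 2. Store the token file in a K8s secret that Spark can mount.
kubectl create secret generic dt-secret --from-file=hadoop.token=/tmp/hadoop.token

# 3. Tell Spark which secret and key to use; it mounts the secret and points
#    HADOOP_TOKEN_FILE_LOCATION at the token file inside the pod.
spark-submit \
  --master k8s://https://apiserver:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.kerberos.tokenSecret.name=dt-secret \
  --conf spark.kubernetes.kerberos.tokenSecret.itemKey=hadoop.token \
  ...
```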
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532135#comment-17532135 ] Gabor Somogyi commented on SPARK-25355: --- After some playground work, code digging and your additional log analysis I see what's going on: * Spark obtains the already mentioned 3 tokens on the submit side * Adds them as HADOOP_TOKEN_FILE_LOCATION to the driver * The driver starts and here comes the trick * UserGroupInformation [loads tokens|https://github.com/apache/hadoop/blob/2b9a8c1d3a2caf1e733d57f346af3ff0d5ba529c/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L740-L766] during loginUser creation, before the proxy user exists (this is actually UGI initialization) * Later on the proxy user is created w/ no tokens * Finally authentication fails on the driver side because there are no credentials I've taken a look at the design doc found in [https://github.com/mesosphere/spark/pull/26] and it states the following: !screenshot-1.png! Bullet point 7 was maybe true for Mesos in 2018 but it's not working w/ K8S now for sure. In the current Spark codebase only executors are using runAsSparkUser but the driver is not (so it runs as the proxy user w/o tokens). So here is my general opinion considering the facts we have, which may change. Adding the --proxy-user param for K8S was a good idea but: * either it was not tested on a cluster at all, or tested on a different execution path * or it was tested and working on a cluster, but after the merge (Mar 17, 2020) something has really changed in other parts of the code * all in all, what I see is that the feature is now completely broken [~pedro.rossi] any comments? According to the latest facts this is a feature blocker.
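A quick way to observe the described state on a live cluster is to check what the driver process actually has. This is only a diagnostic sketch; the driver pod name is a placeholder assumption.

```shell
# Sketch: inspect the driver pod's credential state; pod name is a placeholder.
DRIVER_POD=spark-pi-driver

# The token file handed over by the submit side (loaded into the login user
# during UGI initialization, not into the proxy user):
kubectl exec "$DRIVER_POD" -- printenv HADOOP_TOKEN_FILE_LOCATION

# Whether there is any Kerberos TGT at all for the real user in the pod:
kubectl exec "$DRIVER_POD" -- klist || true
```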
[jira] [Updated] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-25355: -- Attachment: screenshot-1.png
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531777#comment-17531777 ] Gabor Somogyi commented on SPARK-25355: --- Hmmm, seems like I've already debugged such a thing and created a project to print out UGI credentials: https://github.com/gaborgsomogyi/hadoop-token-trace
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531743#comment-17531743 ] Gabor Somogyi commented on SPARK-25355: --- Now I see where the driver blows up and why no Spark logs are available. So the submit side obtains tokens for the HA namenodes one-by-one:
{code:java}
22/05/04 04:13:07 DEBUG KMSClientProvider: Getting new token from http://nn.com:9292/kms/v1/, renewer:proxyUser
...
22/05/04 04:13:07 DEBUG KMSClientProvider: Getting new token from http://:9292/kms/v1/, renewer:proxyUser
{code}
The driver however tries to reach the following HDFS file during jar globbing:
{code:java}
22/04/26 08:54:39 DEBUG HAUtilClient: No HA service delegation token found for logical URI hdfs://dpinonprod:8020/tmp/spark-upload-bf713a0c-166b-43fc-a5e6-24957e75b224/spark-examples_2.12-3.0.1.jar
{code}
Not sure where this "dpinonprod" node comes from, but Spark has not obtained a token for that host. Previously I've seen that if HA and single namenode addresses are mixed up in the configs then such an AccessControlException happened. I would go through the configs and double check them (default FS, additional FS, etc...). If this doesn't help then maybe trace-agent can be used to dump the UGI credentials: https://github.com/gaborgsomogyi/trace-agent Please note, UGI dumping is not yet supported, just security credentials, so it really requires some effort...
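The config double-check suggested above can be sketched with `hdfs getconf`. Hedged: `dpinonprod` is taken from the log line above and the actual nameservice name may differ in your setup.

```shell
# Sketch: verify that the client-side HDFS configs are consistently HA,
# i.e. the logical nameservice and the namenode addresses are not mixed up.
hdfs getconf -confKey fs.defaultFS                 # expect hdfs://<nameservice>, not a host:port
hdfs getconf -confKey dfs.nameservices             # the logical nameservice(s)
hdfs getconf -confKey dfs.ha.namenodes.dpinonprod  # namenodes behind the nameservice
```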
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531646#comment-17531646 ] Gabor Somogyi commented on SPARK-25355: --- > (IllegalArgumentException: Empty cookie header string) -> It's not supposed > to have any impact: https://issues.apache.org/jira/browse/HDFS-15136 OK, this is fixed in the Hadoop 3.1.1 version in use: https://github.com/apache/hadoop/blob/7caf768a8c9a639b6139b2cae8656c89e3d8c58d/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/client/AuthenticatedURL.java#L101 Let's assume things look good on the submit side. The driver needs to re-obtain tokens because in case of failure the tokens provided by the submit client are outdated, and streaming workloads have failed in such scenarios. When I take a look at the driver logs I've no idea what's going on there because no Spark-related logs are available. I presume the log4j.properties stripped off all the useful info. I would expect to see at least the Spark version when the SparkContext is created: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L195 My ask is to enable Spark-related log entries in the driver log to see what's going on. The most important is the "org.apache.spark.deploy.security" package on DEBUG level, where the token handling sits.
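A minimal sketch of the requested logging change, assuming a log4j 1.x style log4j.properties as shipped with Spark 3.1; the config path inside the driver image is a placeholder:

```shell
# Sketch: turn token-handling logging up to DEBUG in the driver image's
# log4j config (path is a placeholder; adjust to where the image keeps it).
cat >> /opt/spark/conf/log4j.properties <<'EOF'
log4j.logger.org.apache.spark=INFO
log4j.logger.org.apache.spark.deploy.security=DEBUG
EOF
```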
[jira] [Comment Edited] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531523#comment-17531523 ] Gabor Somogyi edited comment on SPARK-25355 at 5/4/22 8:00 AM: --- After the attached logs, now I see more. HADOOP_TOKEN_FILE_LOCATION with a proxy user has never worked and it remains that way. You guys have 2 options: * You provide tokens in HADOOP_TOKEN_FILE_LOCATION: in this case UGI picks up the tokens for the current user and authenticates w/ those. Nothing blocks you from generating these tokens for the proxy user manually from your custom code. In this case the --proxy-user config is not needed and it will work like a charm. * You set the --proxy-user config, in which case Spark obtains tokens for the proxy user, authenticating w/ the real user's Kerberos credentials. When I take a look at the logs, Spark tries to obtain tokens for the following external service types:
{code:java}
22/05/04 04:13:07 DEBUG HadoopDelegationTokenManager: Using the following builtin delegation token providers: hadoopfs, hbase, hive.
22/05/04 04:13:07 DEBUG UserGroupInformation: PrivilegedAction as:proxyUser (auth:PROXY) via / (auth:KERBEROS) from:org.apache.spark.deploy.security.HadoopDelegationTokenManager.obtainDelegationTokens(HadoopDelegationTokenManager.scala:146)
{code}
After a while Spark's built-in Hadoop FS delegation token provider kicks in and tries to obtain a token as expected:
{code:java}
22/05/04 04:13:07 DEBUG HadoopFSDelegationTokenProvider: Delegation token renewer is: proxyUser
22/05/04 04:13:07 INFO HadoopFSDelegationTokenProvider: getting token for: DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_1812449855_1, ugi=proxyUser (auth:PROXY) via / (auth:KERBEROS)]] with renewer proxyUser
22/05/04 04:13:07 DEBUG Client: IPC Client (1939869193) connection to nn.com/:8020 from proxyUser sending #6 org.apache.hadoop.hdfs.protocol.ClientProtocol.getDelegationToken
22/05/04 04:13:07 DEBUG Client: IPC Client (1939869193) connection to nn.com/:8020 from proxyUser got value #6
22/05/04 04:13:07 DEBUG ProtobufRpcEngine: Call: getDelegationToken took 2ms
22/05/04 04:13:07 INFO DFSClient: Created token for proxyUser: HDFS_DELEGATION_TOKEN owner=proxyUser, renewer=proxyUser, realUser=/, issueDate=1651637587347, maxDate=1652242387347, sequenceNumber=183545, masterKeyId=606 on ha-hdfs:
22/05/04 04:13:07 DEBUG Client: IPC Client (1939869193) connection to nn.com/:8020 from proxyUser sending #7 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults
22/05/04 04:13:07 DEBUG Client: IPC Client (1939869193) connection to nn.com/:8020 from proxyUser got value #7
22/05/04 04:13:07 DEBUG ProtobufRpcEngine: Call: getServerDefaults took 0ms
22/05/04 04:13:07 DEBUG KMSClientProvider: KMSClientProvider for KMS url: http://nn.com:9292/kms/v1/ delegation token service: :9292 created.
22/05/04 04:13:07 DEBUG KMSClientProvider: KMSClientProvider for KMS url: http://:9292/kms/v1/ delegation token service: 10.207.184.25:9292 created.
22/05/04 04:13:07 DEBUG KMSClientProvider: Current UGI: proxyUser (auth:PROXY) via / (auth:KERBEROS)
22/05/04 04:13:07 DEBUG KMSClientProvider: Real UGI: / (auth:KERBEROS)
22/05/04 04:13:07 DEBUG KMSClientProvider: Login UGI: / (auth:KERBEROS)
22/05/04 04:13:07 DEBUG UserGroupInformation: PrivilegedAction as:/ (auth:KERBEROS) from:org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:1037)
22/05/04 04:13:07 DEBUG KMSClientProvider: Getting new token from http://nn.com:9292/kms/v1/, renewer:proxyUser
22/05/04 04:13:07 DEBUG DelegationTokenAuthenticator: No delegation token found for url=http://nn.com:9292/kms/v1/?op=GETDELEGATIONTOKEN=proxyUser=proxyUser, token=, authenticating with class org.apache.hadoop.security.token.delegation.web.KerberosDelegationTokenAuthenticator$1
22/05/04 04:13:07 DEBUG KerberosAuthenticator: JDK performed authentication on our behalf.
22/05/04 04:13:07 DEBUG AuthenticatedURL: Cannot parse cookie header: java.lang.IllegalArgumentException: Empty cookie header string
at java.net.HttpCookie.parseInternal(HttpCookie.java:826)
at java.net.HttpCookie.parse(HttpCookie.java:202)
at java.net.HttpCookie.parse(HttpCookie.java:178)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL$AuthCookieHandler.put(AuthenticatedURL.java:99)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:390)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:147)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
at
[jira] [Commented] (SPARK-39033) Support --proxy-user for Spark on K8s not working
[ https://issues.apache.org/jira/browse/SPARK-39033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531430#comment-17531430 ] Gabor Somogyi commented on SPARK-39033: --- Simply put, the logs are trash, just like I've mentioned in SPARK-25355 + in the dev list. Here a ConnectException stays, which is even worse than in SPARK-25355. Please improve or your issue is going to be left as-is... or maybe somebody would be so kind as to do your job and repro the issue, which is less probable... > Support --proxy-user for Spark on K8s not working > - > > Key: SPARK-39033 > URL: https://issues.apache.org/jira/browse/SPARK-39033 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.2.0 >Reporter: jagadeesh >Priority: Major > > we are running into a problem when we submit a spark job with --proxy-user on > K8s. > Here is the setup: > * The service id is configured properly on the HDFS side.
> {code:java}
> <property>
>   <name>hadoop.proxyuser.serviceid.groups</name>
>   <value>*</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.serviceid.hosts</name>
>   <value>*</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.serviceid.users</name>
>   <value>*</value>
> </property>
> {code}
> * Getting the service id Kerberos ticket in the spark client.
> * Running spark job without --proxy-user connecting to Kerberized HDFS > cluster - {color:#00875a}WORKS AS EXPECTED .{color} > * Running spark job with --proxy-user= connecting to Kerberized > HDFS cluster - {color:#de350b}FAILS{color} > {code:java} > $SPARK_HOME/bin/spark-submit \ > --master \ > --deploy-mode cluster \ > --proxy-user \ > --name spark-javawordcount \ > --class org.apache.spark.examples.JavaWordCount \ > --conf spark.kubernetes.container.image=\ > --conf spark.kubernetes.driver.podTemplateFile=driver.yaml \ > --conf spark.kubernetes.executor.podTemplateFile=executor.yaml \ > --conf spark.kubernetes.container.image.pullPolicy=Always \ > --conf spark.kubernetes.driver.limit.cores=1 \ > --conf spark.executor.instances=2 \ > --conf spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf \ > --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \ > --conf spark.kubernetes.namespace= \ > --conf spark.eventLog.enabled=true \ > --conf spark.eventLog.dir=hdfs://:8020/scaas/shs_logs \ > --conf spark.kubernetes.file.upload.path=hdfs://:8020/tmp \ > $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.0-1.jar > /user//input{code} > > * ERROR logs from Driver pod > > {code:java} > ++ id -u > + myuid=185 > ++ id -g > + mygid=0 > + set +e > ++ getent passwd 185 > + uidentry= > + set -e > + '[' -z '' ']' > + '[' -w /etc/passwd ']' > + echo '185:x:185:0:anonymous uid:/opt/spark:/bin/false' > + SPARK_CLASSPATH=':/opt/spark/jars/*' > + env > + grep SPARK_JAVA_OPT_ > + sort -t_ -k4 -n > + sed 's/[^=]*=\(.*\)/\1/g' > + readarray -t SPARK_EXECUTOR_JAVA_OPTS > + '[' -n '' ']' > + '[' -z ']' > + '[' -z ']' > + '[' -n '' ']' > + '[' -z x ']' > + SPARK_CLASSPATH='/opt/hadoop/conf::/opt/spark/jars/*' > + '[' -z x ']' > + SPARK_CLASSPATH='/opt/spark/conf:/opt/hadoop/conf::/opt/spark/jars/*' > + case "$1" in > + shift 1 > + CMD=("$SPARK_HOME/bin/spark-submit" --conf > "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client > "$@") > + exec 
/usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf > spark.driver.bindAddress= --deploy-mode client --proxy-user > --properties-file /opt/spark/conf/spark.properties --class > org.apache.spark.examples.JavaWordCount spark-internal /user//input > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform > (file:/opt/spark/jars/spark-unsafe_2.12-3.2.0-1.jar) to constructor > java.nio.DirectByteBuffer(long,int) > WARNING: Please consider reporting this to the maintainers of > org.apache.spark.unsafe.Platform > WARNING: Use --illegal-access=warn to enable warnings of further illegal > reflective access operations > WARNING: All illegal access operations will be denied in a future release > 22/04/21 17:50:30 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > 22/04/21 17:50:30 WARN DomainSocketFactory: The short-circuit local reads > feature cannot be used because libhadoop cannot be loaded. > 22/04/21 17:50:30 WARN Client: Exception encountered while connecting to the > server : org.apache.hadoop.security.AccessControlException: Client cannot > authenticate via:[TOKEN, KERBEROS] > 22/04/21 17:50:31 WARN Client: Exception encountered while connecting to the > server : org.apache.hadoop.security.AccessControlException: Client cannot > authenticate via:[TOKEN, KERBEROS] > 22/04/21 17:50:37 WARN Client: Exception
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530812#comment-17530812 ] Gabor Somogyi commented on SPARK-25355: --- > Thanks for looking further. Your assumption that 3 tokens loaded from > HADOOP_TOKEN_FILE_LOCATION are not compatible to do the authentication is > wrong. Please be aware that I'm one of the authors of this delegation token framework and I'm not guessing but know exactly what's going on. The only question is what you guys are planning and doing :) Since you've not yet provided full logs, the master plan, or how the authentication is intended to work, I'm asking simple questions. If they are not answered then I'm not able to help you move forward. * I've asked for full driver and executor logs but we've received a hadoop-specific snippet. Can we get a full log as asked? If too large then stored externally or something. * In spark-submit you provide cluster mode deployment
{code:java}
...
--deploy-mode cluster \
...
{code}
but in the log I see client mode:
{code:java}
...
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.4.201.155 --deploy-mode client --proxy-user shrprasa --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi spark-internal
...
{code}
So which one is the source of truth? It has a major influence on how security works. Hunting multiple issues is not fun (same issue like the ConnectionRefused in the dev mailing list). So the ask here is to provide full logs and a submit command which belong together. * What is the master plan to provide a TGT for the current user on the driver POD? I'm asking because this is the only way to make Spark obtain a delegation token for the proxy user. But since the logs are partial I'm also not able to tell what happened there. * What is the main intention of using HADOOP_TOKEN_FILE_LOCATION? That is mainly used to load tokens for the current user and not for the proxy user. Handing any token over to the proxy user is never going to happen because that would be a security breach. * And finally, which token do you expect to do the authentication against HDFS? (The Spark-obtained one or the one loaded from HADOOP_TOKEN_FILE_LOCATION?) [~pedro.rossi] how was this tested on a cluster? The description of the PR doesn't tell anything about that.
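The TGT question above can be checked directly in the driver POD. This is only a hedged sketch: the pod name, keytab path and principal are placeholder assumptions, and whether a mounted keytab is acceptable depends on the security requirements of the deployment.

```shell
# Sketch: does the real user have a TGT inside the driver pod? Without one,
# Spark cannot obtain delegation tokens for the proxy user.
DRIVER_POD=spark-pi-driver
kubectl exec "$DRIVER_POD" -- klist || true

# One possible way to provide a TGT: mount a keytab into the pod and kinit.
kubectl exec "$DRIVER_POD" -- kinit -kt /etc/security/keytabs/superuser.keytab superuser@EXAMPLE.COM
```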
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530692#comment-17530692 ] Gabor Somogyi commented on SPARK-25355: --- I've had a further look and found the following: {code:java}
...
22/04/26 08:54:40 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
22/04/26 08:54:40 DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.token.TokenInfo(value=org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSelector.class)
22/04/26 08:54:40 DEBUG SaslRpcClient: tokens aren't supported for this protocol or user doesn't have one
22/04/26 08:54:40 DEBUG SaslRpcClient: client isn't using kerberos
22/04/26 08:54:40 DEBUG UserGroupInformation: PrivilegedActionException as:185 (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
...
{code} This means you've told Spark to load 3 tokens from HADOOP_TOKEN_FILE_LOCATION which are not usable for the authentication. Since I've no idea what the original intention was, all I can suggest is to provide the proper tokens in the HADOOP_TOKEN_FILE_LOCATION file.
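To see which kinds of tokens such a file actually contains, Hadoop 3 ships a token utility that can print them. A sketch, assuming a Hadoop 3 client is on the path; the file path is illustrative and should be replaced with the actual HADOOP_TOKEN_FILE_LOCATION value:

```shell
# Print the kind, service and identifier of each token stored in the file.
# Substitute the real HADOOP_TOKEN_FILE_LOCATION path from the driver pod.
hadoop dtutil print /mnt/secrets/hadoop-credentials/hadoop-tokens
```

Comparing the token kinds in the output against what the NameNode expects (an HDFS_DELEGATION_TOKEN for the right service) would show quickly whether the mounted tokens can ever authenticate.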
[jira] [Comment Edited] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530678#comment-17530678 ] Gabor Somogyi edited comment on SPARK-25355 at 5/2/22 11:32 AM: Guys, when I look at the logs and hear what you say, I honestly don't fully understand what you're doing :) You're saying that you do kinit, which creates a TGT in the user's credential cache on the local machine. Please be aware that this TGT is NOT transferred to the cluster by default. On the other hand the driver is reading credentials from a file: {code:java}
...
22/04/26 08:54:39 DEBUG UserGroupInformation: Loaded 3 tokens
22/04/26 08:54:39 DEBUG UserGroupInformation: UGI loginUser:185 (auth:SIMPLE)
22/04/26 08:54:39 DEBUG UserGroupInformation: PrivilegedAction as:shrprasa (auth:PROXY) via 185 (auth:SIMPLE)
...
22/04/26 08:54:38 DEBUG UserGroupInformation: Reading credentials from location set in HADOOP_TOKEN_FILE_LOCATION: /mnt/secrets/hadoop-credentials/..2022_04_26_08_54_34.1262645511/hadoop-tokens
...
{code} One can authenticate from both credentials (TGT and HADOOP_TOKEN_FILE_LOCATION), so which one is the plan and which one is a side effect? As a general suggestion, client-mode Kerberos authentication suffers from many issues, especially with TGT, so it's not advised. If you want a peaceful life then I warmly suggest using a keytab :)
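Since the comment above recommends a keytab, a keytab-based submission could be sketched as follows. The principal, keytab path, API server address and image are hypothetical placeholders; the point is that with --principal and --keytab Spark can log in and renew delegation tokens itself instead of depending on a local TGT cache:

```shell
# Keytab-based submit sketch; no kinit needed, Spark logs in from the keytab
# and manages delegation token renewal for long-running jobs.
/opt/spark/bin/spark-submit \
  --master k8s://https://k8s-apiserver.example.com:6443 \
  --deploy-mode cluster \
  --principal superuser@EXAMPLE.COM \
  --keytab /etc/security/keytabs/superuser.keytab \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=example/spark:latest \
  local:///opt/spark/examples/jars/spark-examples.jar
```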
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530650#comment-17530650 ] Gabor Somogyi commented on SPARK-25355: --- If the issue still persists then please provide the community with the submit command together with the driver and executor logs...
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530649#comment-17530649 ] Gabor Somogyi commented on SPARK-25355: --- Then there are 2 issues on your side, guys. In the following thread a ConnectionRefused exception is mentioned: https://lists.apache.org/thread/lcn90cs9b0m848yfd5g4ksxsqwkmqbts So which one is the issue, or is it both?
[jira] [Comment Edited] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530370#comment-17530370 ] Gabor Somogyi edited comment on SPARK-25355 at 4/30/22 9:50 AM: Please make sure that the namenode host:port is reachable from the driver pod. If that's working then the ConnectionRefused should go away and only the AccessControlException remains (if authentication is still an issue).
[jira] [Commented] (SPARK-25355) Support --proxy-user for Spark on K8s
[ https://issues.apache.org/jira/browse/SPARK-25355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530369#comment-17530369 ] Gabor Somogyi commented on SPARK-25355: --- I've answered this question on the mailing list but I'll copy it here to track it in the Jira: Please be aware that a ConnectionRefused exception has nothing to do w/ authentication. See the description from the Hadoop wiki: "You get a ConnectionRefused Exception when there is a machine at the address specified, but there is no program listening on the specific TCP port the client is using -and there is no firewall in the way silently dropping TCP connection requests. If you do not know what a TCP connection request is, please consult the specification." This means the namenode on host:port is not reachable at the TCP layer. Maybe there are multiple issues but I'm pretty sure that something is wrong in the K8s network config.
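To confirm that this is purely a transport-layer problem, one can attempt a raw TCP connection from inside the driver pod before involving Kerberos at all. A minimal sketch assuming bash with /dev/tcp support; the host and port are placeholders for the actual namenode address:

```shell
# Attempt a bare TCP connect to the namenode RPC endpoint.
# Prints "reachable" if something is listening, "unreachable" on refusal/timeout.
check_tcp() {
  local host="$1" port="$2"
  if timeout 2 bash -c "cat < /dev/null > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo reachable
  else
    echo unreachable
  fi
}

check_tcp namenode.example.com 8020
```

If this prints "unreachable", the problem is network configuration (DNS, service routing, firewall) and no amount of token or TGT work will help.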
[jira] [Commented] (SPARK-28173) Add Kafka delegation token proxy user support
[ https://issues.apache.org/jira/browse/SPARK-28173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502162#comment-17502162 ] Gabor Somogyi commented on SPARK-28173: --- Looks like there is a willingness to merge the Kafka part. Hope we can kick this forward relatively soon and finish this feature. > Add Kafka delegation token proxy user support > - > > Key: SPARK-28173 > URL: https://issues.apache.org/jira/browse/SPARK-28173 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > In SPARK-26592 I've turned off proxy user usage because > https://issues.apache.org/jira/browse/KAFKA-6945 is not yet implemented. > Since the KIP will be under discussion and hopefully implemented, here is this > Jira to track the Spark-side effort. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37391) SIGNIFICANT bottleneck introduced by fix for SPARK-32001
[ https://issues.apache.org/jira/browse/SPARK-37391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17447829#comment-17447829 ] Gabor Somogyi commented on SPARK-37391: --- [~hyukjin.kwon] thanks for pinging me. I've added my comment here: https://github.com/apache/spark/pull/29024/files#r754476290 The problem and the surrounding constraints are provided there. If somebody has a meaningful solution then please share. To sum it up here: a single JVM has only one security context, and JDBC clients are able to read authentication credentials only from there, which is the bottleneck. > SIGNIFICANT bottleneck introduced by fix for SPARK-32001 > > > Key: SPARK-37391 > URL: https://issues.apache.org/jira/browse/SPARK-37391 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0 > Environment: N/A >Reporter: Danny Guinther >Priority: Major > Attachments: so-much-blocking.jpg, spark-regression-dashes.jpg > > > The fix for https://issues.apache.org/jira/browse/SPARK-32001 ( > [https://github.com/apache/spark/pull/29024/files#diff-345beef18081272d77d91eeca2d9b5534ff6e642245352f40f4e9c9b8922b085R58] > ) does not seem to have considered the reality that some apps may rely on > being able to establish many JDBC connections simultaneously for performance > reasons. > The fix forces concurrency to 1 when establishing database connections and > that strikes me as a *significant* user-impacting change and a *significant* > bottleneck. > Can anyone propose a workaround for this? I have an app that makes > connections to thousands of databases and I can't upgrade to any version > >3.1.x because of this significant bottleneck. > > Thanks in advance for your help!
[jira] [Commented] (SPARK-36765) Spark Support for MS Sql JDBC connector with Kerberos/Keytab
[ https://issues.apache.org/jira/browse/SPARK-36765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416519#comment-17416519 ] Gabor Somogyi commented on SPARK-36765: --- It was a long time ago that I did that, and AFAIR it took me almost a month to make it work, so it's definitely a horror task! My knowledge is cloudy because it was not yesterday, but I remember something like this: the exception generally indicates that the driver can not find the appropriate sqljdbc_auth lib in the JVM library path. To correct the problem, one can use the java -D option to specify the "java.library.path" system property value. Worth mentioning that the full path must be set, otherwise it was not working. All in all I've faced at least 5-6 different issues which were extremely hard to address. Hope others need less time to solve them. > Spark Support for MS Sql JDBC connector with Kerberos/Keytab > > > Key: SPARK-36765 > URL: https://issues.apache.org/jira/browse/SPARK-36765 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2 > Environment: Unix Redhat Environment >Reporter: Dilip Thallam Sridhar >Priority: Major > Fix For: 3.1.2 > > > Hi Team, > > We are using the Spark-3.0.2 to connect to MS SqlServer with the following > instruction > Also tried with the Spark-3.1.2 Version, > > 1) download mssql-jdbc-9.4.0.jre8.jar > 2) Generated Keytab using kinit > 3) Validate Keytab using klist > 4) Run the spark job with jdbc_library, principal and keytabs passed > .config("spark.driver.extraClassPath", spark_jar_lib) \ > .config("spark.executor.extraClassPath", spark_jar_lib) \ > 5) connection_url = > "jdbc:sqlserver://{}:{};databaseName={};integratedSecurity=true;authenticationSchema=JavaKerberos"\ > .format(jdbc_host_name, jdbc_port, jdbc_database_name) > Note: without integratedSecurity=true;authenticationSchema=JavaKerberos it > looks for the usual username/password option to connect > 6) passing the following options during spark
read. > .option("principal", database_principal) \ > .option("files", database_keytab) \ > .option("keytab", database_keytab) \ > > tried with files and keytab, just files, and with all above 3 parameters > > We are unable to connect to SqlServer from Spark and getting the following > error shown below. > > A) Wanted to know if anybody was successful Spark to SqlServer? (as I see > the previous Jira has been closed) > https://issues.apache.org/jira/browse/SPARK-12312 > https://issues.apache.org/jira/browse/SPARK-31337 > > B) If yes, could you let us know if there are any additional configs needed > for Spark to connect to SqlServer please? > Appreciate if we can get inputs to resolve this error. > > > Full Stack Trace > {code} > Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: This driver is > not configured for integrated authentication. at > com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1352) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:2329) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:1905) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.access$000(SQLServerConnection.java:41) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:1893) > at > com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4575) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1045) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:817) > at > com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:700) > at > com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:842) > at > 
org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49) > at > org.apache.spark.sql.execution.datasources.jdbc.connection.SecureConnectionProvider.getConnection(SecureConnectionProvider.scala:44) > at > org.apache.spark.sql.execution.datasources.jdbc.connection.MSSQLConnectionProvider.org$apache$spark$sql$execution$datasources$jdbc$connection$MSSQLConnectionProvider$$super$getConnection(MSSQLConnectionProvider.scala:69) > at > org.apache.spark.sql.execution.datasources.jdbc.connection.MSSQLConnectionProvider$$anon$1.run(MSSQLConnectionProvider.scala:69) > at >
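The java -D option mentioned in the comment above is typically passed to a Spark job via the driver and executor extra Java options. A sketch; the library directory is a hypothetical placeholder, and the directory must contain the native auth library the MSSQL JDBC driver loads (sqljdbc_auth on Windows, or the corresponding shared library on the platform in use):

```shell
# Point java.library.path at the full directory path holding the native
# integrated-authentication library; a partial path did not work per the comment.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Djava.library.path=/opt/mssql/native-auth" \
  --conf "spark.executor.extraJavaOptions=-Djava.library.path=/opt/mssql/native-auth" \
  your_job.py
```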
[jira] [Commented] (SPARK-31460) spark-sql-kafka source in spark 2.4.4 causes reading stream failure frequently
[ https://issues.apache.org/jira/browse/SPARK-31460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399594#comment-17399594 ] Gabor Somogyi commented on SPARK-31460: --- When such a thing happens, the Kafka connector is super slow and/or stuck in an infinite loop, so the Kafka logs need to be checked to find out why it's happening... Apart from that we've made quite some improvements on the 3.x line, so please double-check the behavior w/ the latest Spark version. > spark-sql-kafka source in spark 2.4.4 causes reading stream failure frequently > -- > > Key: SPARK-31460 > URL: https://issues.apache.org/jira/browse/SPARK-31460 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.4 >Reporter: vinay >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > In spark 2.4.4 , it provides a source "spark-sql-kafka-0-10_2.11". > > When I wanted to read from my kafka-0.10.2.11 cluster, it throws out an error > "*java.util.concurrent.TimeoutException: Cannot fetch record for offset x > in 1000 milliseconds*" frequently, and the job thus failed. > > I see this issue was seen before in 2.3 according to ticket 23829 and an > upgrade to spark 2.4 was supposed to solve this. > > {code:java} > compile group: 'org.apache.spark', name: 'spark-sql-kafka-0-10_2.11', > version: '2.4.4'{code} > Here is the error stack. > {code:java} > org.apache.spark.SparkException: Writing job aborted. 
> > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:92) > > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:296) > > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3389) > org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2788) > org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2788) > org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370) > > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78) > > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125) > > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73) > org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369) > org.apache.spark.sql.Dataset.collect(Dataset.scala:2788) > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$5$$anonfun$apply$17.apply(MicroBatchExecution.scala:540) > > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78) > > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125) > > 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73) > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$5.apply(MicroBatchExecution.scala:535) > > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351) > > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:534) > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:198) > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166) > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166) > > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351) > > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) >
[jira] [Created] (SPARK-35993) Flaky test: org.apache.spark.sql.execution.streaming.state.RocksDBSuite.ensure that concurrent update and cleanup consistent versions
Gabor Somogyi created SPARK-35993: - Summary: Flaky test: org.apache.spark.sql.execution.streaming.state.RocksDBSuite.ensure that concurrent update and cleanup consistent versions Key: SPARK-35993 URL: https://issues.apache.org/jira/browse/SPARK-35993 Project: Spark Issue Type: Bug Components: Spark Core, Tests Affects Versions: 3.1.2 Reporter: Gabor Somogyi Appeared in jenkins: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140575/testReport/org.apache.spark.sql.execution.streaming.state/RocksDBSuite/ensure_that_concurrent_update_and_cleanup_consistent_versions/ {code:java} Error Message java.io.FileNotFoundException: File /home/jenkins/workspace/SparkPullRequestBuilder@2/target/tmp/spark-21674620-ac83-4ad3-a153-5a7adf909244/20.zip does not exist Stacktrace sbt.ForkMain$ForkError: java.io.FileNotFoundException: File /home/jenkins/workspace/SparkPullRequestBuilder@2/target/tmp/spark-21674620-ac83-4ad3-a153-5a7adf909244/20.zip does not exist at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.(ChecksumFileSystem.java:160) at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:372) at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:74) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:976) at org.apache.spark.util.Utils$.unzipFilesFromFile(Utils.scala:3132) at org.apache.spark.sql.execution.streaming.state.RocksDBFileManager.loadCheckpointFromDfs(RocksDBFileManager.scala:174) at org.apache.spark.sql.execution.streaming.state.RocksDB.load(RocksDB.scala:103) at 
org.apache.spark.sql.execution.streaming.state.RocksDBSuite.withDB(RocksDBSuite.scala:443) at org.apache.spark.sql.execution.streaming.state.RocksDBSuite.$anonfun$new$57(RocksDBSuite.scala:397) at org.apache.spark.sql.catalyst.util.package$.quietly(package.scala:42) at org.apache.spark.sql.execution.streaming.state.RocksDBSuite.$anonfun$new$56(RocksDBSuite.scala:341) at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226) at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:190) at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224) at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236) at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218) at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:62) at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234) at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227) at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:62) at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269) at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413) at scala.collection.immutable.List.foreach(List.scala:431) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475) at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269) at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268) at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1563) at org.scalatest.Suite.run(Suite.scala:1112) at org.scalatest.Suite.run$(Suite.scala:1094) at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1563) at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273) at org.scalatest.SuperEngine.runImpl(Engine.scala:535) at
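The stack trace above shows `20.zip` vanishing between the moment the test decides to load version 20 and the moment the file is opened. As a rough, hypothetical model of that window (plain Java with invented names, not Spark's actual RocksDBFileManager code), the check-then-open gap where a concurrent cleanup can strike looks like this:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the suspected race: maintenance deletes an old checkpoint
// version in the gap between a reader's existence check and its open.
public class CheckpointRace {
    private final Map<Integer, String> versions = new ConcurrentHashMap<>();

    public void upload(int version, String zip) {
        versions.put(version, zip);
    }

    // Keeps only versions newer than (latest - retain).
    public void cleanup(int latest, int retain) {
        versions.keySet().removeIf(v -> v <= latest - retain);
    }

    // Check-then-open: a cleanup() running concurrently between the two
    // steps makes the open fail, which is what the stack trace shows.
    public String load(int version) {
        if (!versions.containsKey(version)) {
            throw new IllegalStateException(version + ".zip does not exist");
        }
        String zip = versions.get(version); // cleanup may have run by now
        if (zip == null) {
            throw new IllegalStateException(version + ".zip does not exist");
        }
        return zip;
    }
}
```

The sketch is deterministic only when the two threads are serialized by hand, which is exactly why such tests are flaky in CI: the interleaving is timing-dependent.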
[jira] [Commented] (SPARK-33223) Expose state information on SS UI
[ https://issues.apache.org/jira/browse/SPARK-33223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338901#comment-17338901 ] Gabor Somogyi commented on SPARK-33223: --- [~smilegator] sure, filed SPARK-35311 and preparing a PR. > Expose state information on SS UI > - > > Key: SPARK-33223 > URL: https://issues.apache.org/jira/browse/SPARK-33223 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming, Web UI >Affects Versions: 3.0.1 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Fix For: 3.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35311) Add exposed SS UI state information metrics to the documentation
Gabor Somogyi created SPARK-35311: - Summary: Add exposed SS UI state information metrics to the documentation Key: SPARK-35311 URL: https://issues.apache.org/jira/browse/SPARK-35311 Project: Spark Issue Type: Improvement Components: Structured Streaming Affects Versions: 3.2.0 Reporter: Gabor Somogyi
[jira] [Closed] (SPARK-34383) Optimize WAL commit phase on SS
[ https://issues.apache.org/jira/browse/SPARK-34383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-34383. - > Optimize WAL commit phase on SS > --- > > Key: SPARK-34383 > URL: https://issues.apache.org/jira/browse/SPARK-34383 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 3.2.0 > > > I found there are unnecessary accesses / expensive operations on the file system > in the WAL commit phase of SS. > They can be optimized via caching (using driver memory a bit) & replacing FS > operations. This brings reduced latency per batch, especially when checkpointing > against an object store.
[jira] [Resolved] (SPARK-34383) Optimize WAL commit phase on SS
[ https://issues.apache.org/jira/browse/SPARK-34383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-34383. --- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31495 https://github.com/apache/spark/pull/31495 > Optimize WAL commit phase on SS > --- > > Key: SPARK-34383 > URL: https://issues.apache.org/jira/browse/SPARK-34383 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 3.2.0 > > > I found there are unnecessary accesses / expensive operations on the file system > in the WAL commit phase of SS. > They can be optimized via caching (using driver memory a bit) & replacing FS > operations. This brings reduced latency per batch, especially when checkpointing > against an object store.
[jira] [Assigned] (SPARK-34383) Optimize WAL commit phase on SS
[ https://issues.apache.org/jira/browse/SPARK-34383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned SPARK-34383: - Assignee: Jungtaek Lim > Optimize WAL commit phase on SS > --- > > Key: SPARK-34383 > URL: https://issues.apache.org/jira/browse/SPARK-34383 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > > I found there are unnecessary accesses / expensive operations on the file system > in the WAL commit phase of SS. > They can be optimized via caching (using driver memory a bit) & replacing FS > operations. This brings reduced latency per batch, especially when checkpointing > against an object store.
[jira] [Created] (SPARK-34580) Provide the relationship between batch ID and SQL executions (and/or Jobs) in SS UI page
Gabor Somogyi created SPARK-34580: - Summary: Provide the relationship between batch ID and SQL executions (and/or Jobs) in SS UI page Key: SPARK-34580 URL: https://issues.apache.org/jira/browse/SPARK-34580 Project: Spark Issue Type: Improvement Components: Structured Streaming Affects Versions: 3.1.1 Reporter: Gabor Somogyi The current SS UI page focuses on showing the trends among the batches, which is great for figuring out whether the streaming query is running healthily or not, and for spotting the oddness of a specific batch. One thing that still bugs you is that all you can get from here is the batch ID (number), which means you have to find the related SQL executions and Jobs manually with the batch ID. It's highly likely bound to the recent runs of SQL executions/Jobs, so you may not need to search through lots of pages, but the fact that you need to find it by yourself manually is still annoying. It would be nice if we could provide the relationship between batch ID and SQL executions (probably Jobs as well if the space is enough) and links to these pages, like the job page links we see from the SQL execution page.
[jira] [Commented] (SPARK-34580) Provide the relationship between batch ID and SQL executions (and/or Jobs) in SS UI page
[ https://issues.apache.org/jira/browse/SPARK-34580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17292787#comment-17292787 ] Gabor Somogyi commented on SPARK-34580: --- cc [~kabhwan] [~hyukjin.kwon] [~zsxwing] [~viirya] It will take some time to come up with a solution that makes sense, but I've started. > Provide the relationship between batch ID and SQL executions (and/or Jobs) in > SS UI page > > > Key: SPARK-34580 > URL: https://issues.apache.org/jira/browse/SPARK-34580 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.1.1 >Reporter: Gabor Somogyi >Priority: Major > > The current SS UI page focuses on showing the trends among the batches, which is > great for figuring out whether the streaming query is running healthily or not, > and for spotting the oddness of a specific batch. > One thing that still bugs you is that all you can get from here is the batch > ID (number), which means you have to find the related SQL executions and Jobs > manually with the batch ID. It's highly likely bound to the recent runs of SQL > executions/Jobs, so you may not need to search through lots of pages, but the > fact that you need to find it by yourself manually is still annoying. > It would be nice if we could provide the relationship between batch ID and SQL > executions (probably Jobs as well if the space is enough) and links to these > pages, like the job page links we see from the SQL execution page.
[jira] [Updated] (SPARK-34497) JDBC connection provider is not removing kerberos credentials from JVM security context
[ https://issues.apache.org/jira/browse/SPARK-34497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-34497: -- Description: Some of the built-in JDBC connection providers are changing the JVM security context to do the authentication, which is fine. The problematic part is that executors can be reused by another query. The following situation leads to incorrect behaviour: * Query1 opens a JDBC connection and changes the JVM security context in Executor1 * Query2 tries to open a JDBC connection but realizes there is already an entry for that DB type in Executor1 * Query2 does not change the JVM security context and uses Query1's keytab and principal * Query2 fails with an authentication error > JDBC connection provider is not removing kerberos credentials from JVM > security context > --- > > Key: SPARK-34497 > URL: https://issues.apache.org/jira/browse/SPARK-34497 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.2, 3.2.0, 3.1.2 >Reporter: Gabor Somogyi >Priority: Major > > Some of the built-in JDBC connection providers are changing the JVM security > context to do the authentication, which is fine. The problematic part is that > executors can be reused by another query. The following situation leads to > incorrect behaviour: > * Query1 opens a JDBC connection and changes the JVM security context in Executor1 > * Query2 tries to open a JDBC connection but realizes there is already an > entry for that DB type in Executor1 > * Query2 does not change the JVM security context and uses Query1's keytab and > principal > * Query2 fails with an authentication error
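The four-step scenario in the description can be sketched as a toy model. This is an illustration of the reported behaviour only, not Spark's actual connection-provider code; the class and method names are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the bug: a per-DB-type entry left in the JVM
// security context by Query1 makes Query2 skip its own login and
// silently inherit Query1's keytab/principal.
public class JaasEntryModel {
    // dbType -> principal that installed the security-context entry
    private final Map<String, String> securityContext = new HashMap<>();

    // Buggy connect: installs an entry but nothing ever removes it,
    // so an existing entry is reused as-is for the next query.
    public String connect(String dbType, String principal) {
        return securityContext.computeIfAbsent(dbType, t -> principal);
    }

    // The fix direction: remove the entry when the connection is torn down.
    public void disconnect(String dbType) {
        securityContext.remove(dbType);
    }
}
```

In the model, a second query on the same executor gets the first query's principal back until `disconnect` cleans the entry up, which mirrors why Query2 fails to authenticate as itself.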
[jira] [Commented] (SPARK-34497) JDBC connection provider is not removing kerberos credentials from JVM security context
[ https://issues.apache.org/jira/browse/SPARK-34497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289815#comment-17289815 ] Gabor Somogyi commented on SPARK-34497: --- Thanks for the suggestion, filed. > JDBC connection provider is not removing kerberos credentials from JVM > security context > --- > > Key: SPARK-34497 > URL: https://issues.apache.org/jira/browse/SPARK-34497 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.2, 3.2.0, 3.1.2 >Reporter: Gabor Somogyi >Priority: Major > > Some of the built-in JDBC connection providers are changing the JVM security > context to do the authentication, which is fine. The problematic part is that > executors can be reused by another query. The following situation leads to > incorrect behaviour: > * Query1 opens a JDBC connection and changes the JVM security context in Executor1 > * Query2 tries to open a JDBC connection but realizes there is already an > entry for that DB type in Executor1 > * Query2 does not change the JVM security context and uses Query1's keytab and > principal > * Query2 fails with an authentication error
[jira] [Commented] (SPARK-34497) JDBC connection provider is not removing kerberos credentials from JVM security context
[ https://issues.apache.org/jira/browse/SPARK-34497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288387#comment-17288387 ] Gabor Somogyi commented on SPARK-34497: --- Working on this. > JDBC connection provider is not removing kerberos credentials from JVM > security context > --- > > Key: SPARK-34497 > URL: https://issues.apache.org/jira/browse/SPARK-34497 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.2, 3.1.0 >Reporter: Gabor Somogyi >Priority: Major >
[jira] [Created] (SPARK-34497) JDBC connection provider is not removing kerberos credentials from JVM security context
Gabor Somogyi created SPARK-34497: - Summary: JDBC connection provider is not removing kerberos credentials from JVM security context Key: SPARK-34497 URL: https://issues.apache.org/jira/browse/SPARK-34497 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.0.2, 3.1.0 Reporter: Gabor Somogyi
[jira] [Closed] (SPARK-12312) Support JDBC Kerberos w/ keytab
[ https://issues.apache.org/jira/browse/SPARK-12312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-12312. - > Support JDBC Kerberos w/ keytab > --- > > Key: SPARK-12312 > URL: https://issues.apache.org/jira/browse/SPARK-12312 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2, 2.4.2 >Reporter: nabacg >Assignee: Gabor Somogyi >Priority: Minor > Fix For: 3.1.0 > > > When loading DataFrames from the JDBC datasource with Kerberos authentication, > remote executors (yarn-client/cluster etc. modes) fail to establish a > connection due to the lack of a Kerberos ticket or the ability to generate it. > This is a real issue when trying to ingest data from kerberized data sources > (SQL Server, Oracle) in enterprise environments where exposing simple > authentication access is not an option due to IT policy issues.
[jira] [Resolved] (SPARK-12312) Support JDBC Kerberos w/ keytab
[ https://issues.apache.org/jira/browse/SPARK-12312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-12312. --- Fix Version/s: 3.1.0 Resolution: Fixed > Support JDBC Kerberos w/ keytab > --- > > Key: SPARK-12312 > URL: https://issues.apache.org/jira/browse/SPARK-12312 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2, 2.4.2 >Reporter: nabacg >Assignee: Gabor Somogyi >Priority: Minor > Fix For: 3.1.0 > > > When loading DataFrames from the JDBC datasource with Kerberos authentication, > remote executors (yarn-client/cluster etc. modes) fail to establish a > connection due to the lack of a Kerberos ticket or the ability to generate it. > This is a real issue when trying to ingest data from kerberized data sources > (SQL Server, Oracle) in enterprise environments where exposing simple > authentication access is not an option due to IT policy issues.
[jira] [Closed] (SPARK-31857) Support Azure SQLDB Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-31857. - > Support Azure SQLDB Kerberos login in JDBC connector > > > Key: SPARK-31857 > URL: https://issues.apache.org/jira/browse/SPARK-31857 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: jobit mathew >Priority: Minor >
[jira] [Resolved] (SPARK-31884) Support MongoDB Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-31884. --- Resolution: Won't Do > Support MongoDB Kerberos login in JDBC connector > > > Key: SPARK-31884 > URL: https://issues.apache.org/jira/browse/SPARK-31884 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: jobit mathew >Priority: Minor >
[jira] [Closed] (SPARK-31884) Support MongoDB Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-31884. - > Support MongoDB Kerberos login in JDBC connector > > > Key: SPARK-31884 > URL: https://issues.apache.org/jira/browse/SPARK-31884 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: jobit mathew >Priority: Minor >
[jira] [Commented] (SPARK-31884) Support MongoDB Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282280#comment-17282280 ] Gabor Somogyi commented on SPARK-31884: --- Since the API is implemented, I think it's now possible to add the mentioned feature as an external plugin, so I'm closing this Jira. If committers or PMCs think we should add this as a built-in provider, feel free to re-open it. > Support MongoDB Kerberos login in JDBC connector > > > Key: SPARK-31884 > URL: https://issues.apache.org/jira/browse/SPARK-31884 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: jobit mathew >Priority: Minor >
[jira] [Resolved] (SPARK-31857) Support Azure SQLDB Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-31857. --- Resolution: Won't Do > Support Azure SQLDB Kerberos login in JDBC connector > > > Key: SPARK-31857 > URL: https://issues.apache.org/jira/browse/SPARK-31857 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: jobit mathew >Priority: Minor >
[jira] [Closed] (SPARK-31815) Support Hive Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-31815. - > Support Hive Kerberos login in JDBC connector > - > > Key: SPARK-31815 > URL: https://issues.apache.org/jira/browse/SPARK-31815 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major >
[jira] [Commented] (SPARK-31857) Support Azure SQLDB Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282279#comment-17282279 ] Gabor Somogyi commented on SPARK-31857: --- Since the API is implemented, I think it's now possible to add the mentioned feature as an external plugin, so I'm closing this Jira. If committers or PMCs think we should add this as a built-in provider, feel free to re-open it. > Support Azure SQLDB Kerberos login in JDBC connector > > > Key: SPARK-31857 > URL: https://issues.apache.org/jira/browse/SPARK-31857 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: jobit mathew >Priority: Minor >
[jira] [Resolved] (SPARK-31815) Support Hive Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-31815. --- Resolution: Won't Do > Support Hive Kerberos login in JDBC connector > - > > Key: SPARK-31815 > URL: https://issues.apache.org/jira/browse/SPARK-31815 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major >
[jira] [Commented] (SPARK-31815) Support Hive Kerberos login in JDBC connector
[ https://issues.apache.org/jira/browse/SPARK-31815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282277#comment-17282277 ] Gabor Somogyi commented on SPARK-31815: --- Since the API is implemented, I think it's now possible to add the mentioned feature as an external plugin, so I'm closing this Jira. If committers or PMCs think we should add this as a built-in provider, feel free to re-open it. > Support Hive Kerberos login in JDBC connector > - > > Key: SPARK-31815 > URL: https://issues.apache.org/jira/browse/SPARK-31815 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major >
[jira] [Commented] (SPARK-34198) Add RocksDB StateStore as external module
[ https://issues.apache.org/jira/browse/SPARK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17271308#comment-17271308 ] Gabor Somogyi commented on SPARK-34198: --- +1 on this. I started to review it back when someone wanted to add it as an embedded dependency. > Add RocksDB StateStore as external module > - > > Key: SPARK-34198 > URL: https://issues.apache.org/jira/browse/SPARK-34198 > Project: Spark > Issue Type: New Feature > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Priority: Major > > Currently Spark SS only has one built-in StateStore implementation, > HDFSBackedStateStore, which actually uses an in-memory map to store state rows. As > there are more and more streaming applications, some of them require large > state in stateful operations such as streaming aggregation and join. > Several other major streaming frameworks already use RocksDB for state > management, so it is a proven choice for large state usage. But > Spark SS still lacks a built-in state store for this requirement. > We would like to explore the possibility of adding a RocksDB-based StateStore to > Spark SS. Given the concern about adding RocksDB as a direct dependency, our > plan is to add this StateStore as an external module first.
[jira] [Created] (SPARK-34090) HadoopDelegationTokenManager.isServiceEnabled used in KafkaTokenUtil.needTokenUpdate needs to be cached because it slows down Kafka stream processing in case of delegation token
Gabor Somogyi created SPARK-34090: - Summary: HadoopDelegationTokenManager.isServiceEnabled used in KafkaTokenUtil.needTokenUpdate needs to be cached because it slows down Kafka stream processing in case of delegation token Key: SPARK-34090 URL: https://issues.apache.org/jira/browse/SPARK-34090 Project: Spark Issue Type: Bug Components: Structured Streaming Affects Versions: 3.1.1 Reporter: Gabor Somogyi
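The title describes the whole fix: memoize an expensive boolean check so the per-record hot path stops paying for it. One plausible shape of such caching is sketched below; this is a hypothetical illustration with invented names, not the actual Spark patch:

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: evaluate an expensive, effectively-constant check
// once and reuse the result on every subsequent call.
public class CachedCheck {
    private final BooleanSupplier expensiveCheck;
    private volatile Boolean cached; // null until the first evaluation

    public CachedCheck(BooleanSupplier expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    public boolean isEnabled() {
        Boolean c = cached;
        if (c == null) {
            c = expensiveCheck.getAsBoolean(); // evaluated at most once per racer
            cached = c;
        }
        return c;
    }
}
```

The benign race on `cached` is acceptable here because the check is assumed to be stable for the lifetime of the object; if the enabled state can change (e.g. on token renewal), the cache would need an invalidation hook instead.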
[jira] [Created] (SPARK-34032) Add Kafka delegation token truststore and keystore type configuration
Gabor Somogyi created SPARK-34032: - Summary: Add Kafka delegation token truststore and keystore type configuration Key: SPARK-34032 URL: https://issues.apache.org/jira/browse/SPARK-34032 Project: Spark Issue Type: Bug Components: Structured Streaming Affects Versions: 3.0.1 Reporter: Gabor Somogyi
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17252153#comment-17252153 ] Gabor Somogyi commented on SPARK-33635: --- {quote}Remember, based on all my testing, and raw kafka reads on my system - the 3.0.1 spark is performing in line with expectations.{quote} Good to hear. You don't have to hurry since I'm on vacation this year, unless a breaking issue appears in the upcoming Spark release. > Performance regression in Kafka read > > > Key: SPARK-33635 > URL: https://issues.apache.org/jira/browse/SPARK-33635 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.0.1 > Environment: A simple 5 node system. A simple data row of csv data in > kafka, evenly distributed between the partitions. > Open JDK 1.8.0.252 > Spark in stand alone - 5 nodes, 10 workers (2 workers per node, each locked to > a distinct NUMA group) > kafka (v 2.3.1) cluster - 5 nodes (1 broker per node). > Centos 7.7.1908 > 1 topic, 10 partitions, 1 hour queue life > (this is just one of the clusters we have; I have tested on all of them and > they all exhibit the same performance degradation) >Reporter: David Wyles >Priority: Major > > I have observed a slowdown in the reading of data from kafka on all of our > systems when migrating from spark 2.4.5 to Spark 3.0.0 (and Spark 3.0.1). > I have created a sample project to isolate the problem as much as possible, > with just a read of all data from a kafka topic (see > [https://github.com/codegorillauk/spark-kafka-read] ). > With 2.4.5, across multiple runs, > I get a stable read rate of 1,120,000 (1.12 mill) rows per second. > With 3.0.0 or 3.0.1, across multiple runs, > I get a stable read rate of 632,000 (0.632 mil) rows per second. > This represents a *44% loss in performance*. Which is a lot. > I have been working through the spark-sql-kafka-0-10 code base, but changes for > spark 3 have been ongoing for over a year and it's difficult to pinpoint an > exact change or reason for the degradation. > I am happy to help fix this problem, but will need some assistance as I am > unfamiliar with the spark-sql-kafka-0-10 project. > > A sample of the data my test reads (note: it's not parsing csv - this is just > test data) > > 160692180,001e0610e532,lightsense,tsl250rd,intensity,21853,53.262,acceleration_z,651,ep,290,commit,913,pressure,138,pm1,799,uv_intensity,823,idletime,-372,count,-72,ir_intensity,185,concentration,-61,flags,-532,tx,694.36,ep_heatsink,-556.92,acceleration_x,-221.40,fw,910.53,sample_flow_rate,-959.60,uptime,-515.15,pm10,-768.03,powersupply,214.72,magnetic_field_y,-616.04,alphasense,606.73,AoT_Chicago,053,Racine > Ave & 18th St Chicago IL,41.857959,-87.6564270002,AoT Chicago (S) > [C],2017/12/15 00:00:00,
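For reference, the reported figures work out as follows (the ticket's "44% loss" is the rounded value of the drop from 1,120,000 rows/s on 2.4.5 to 632,000 rows/s on 3.0.x):

```java
// Computes the relative throughput loss between two measured read rates.
public class Throughput {
    public static double lossPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }
}
```

Plugging in the ticket's numbers gives roughly a 43.6% drop, so "44%" is a fair summary of the measurement.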
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248550#comment-17248550 ] Gabor Somogyi commented on SPARK-33635: --- {quote}The collect in this test case is only 13 items of data after the group by - so I know that's not going to impact it. But I can modify it to just read and write to kafka. {quote} Yeah, we need to reduce the use case to the most minimal app to measure only what we need. Aggregations and all that stuff don't belong in Kafka read and write performance. > Performance regression in Kafka read > > > Key: SPARK-33635 > URL: https://issues.apache.org/jira/browse/SPARK-33635 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.0.1 > Environment: A simple 5 node system. A simple data row of csv data in > kafka, evenly distributed between the partitions. > Open JDK 1.8.0.252 > Spark in stand alone - 5 nodes, 10 workers (2 workers per node, each locked to > a distinct NUMA group) > kafka (v 2.3.1) cluster - 5 nodes (1 broker per node). > Centos 7.7.1908 > 1 topic, 10 partitions, 1 hour queue life > (this is just one of the clusters we have; I have tested on all of them and > they all exhibit the same performance degradation) >Reporter: David Wyles >Priority: Major > > I have observed a slowdown in the reading of data from kafka on all of our > systems when migrating from spark 2.4.5 to Spark 3.0.0 (and Spark 3.0.1). > I have created a sample project to isolate the problem as much as possible, > with just a read of all data from a kafka topic (see > [https://github.com/codegorillauk/spark-kafka-read] ). > With 2.4.5, across multiple runs, > I get a stable read rate of 1,120,000 (1.12 mill) rows per second. > With 3.0.0 or 3.0.1, across multiple runs, > I get a stable read rate of 632,000 (0.632 mil) rows per second. > This represents a *44% loss in performance*. Which is a lot. > I have been working through the spark-sql-kafka-0-10 code base, but changes for > spark 3 have been ongoing for over a year and it's difficult to pinpoint an > exact change or reason for the degradation. > I am happy to help fix this problem, but will need some assistance as I am > unfamiliar with the spark-sql-kafka-0-10 project. > > A sample of the data my test reads (note: it's not parsing csv - this is just > test data) > > 160692180,001e0610e532,lightsense,tsl250rd,intensity,21853,53.262,acceleration_z,651,ep,290,commit,913,pressure,138,pm1,799,uv_intensity,823,idletime,-372,count,-72,ir_intensity,185,concentration,-61,flags,-532,tx,694.36,ep_heatsink,-556.92,acceleration_x,-221.40,fw,910.53,sample_flow_rate,-959.60,uptime,-515.15,pm10,-768.03,powersupply,214.72,magnetic_field_y,-616.04,alphasense,606.73,AoT_Chicago,053,Racine > Ave & 18th St Chicago IL,41.857959,-87.6564270002,AoT Chicago (S) > [C],2017/12/15 00:00:00,
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248549#comment-17248549 ] Gabor Somogyi commented on SPARK-33635: --- I mixed it up with DStreams; in Structured Streaming and SQL there is no turn-off flag. > Performance regression in Kafka read > > > Key: SPARK-33635 > URL: https://issues.apache.org/jira/browse/SPARK-33635 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.0.1 > Environment: A simple 5 node system. A simple data row of csv data in > kafka, evenly distributed between the partitions. > Open JDK 1.8.0.252 > Spark in stand alone - 5 nodes, 10 workers (2 workers per node, each locked to > a distinct NUMA group) > kafka (v 2.3.1) cluster - 5 nodes (1 broker per node). > Centos 7.7.1908 > 1 topic, 10 partitions, 1 hour queue life > (this is just one of the clusters we have; I have tested on all of them and > they all exhibit the same performance degradation) >Reporter: David Wyles >Priority: Major > > I have observed a slowdown in the reading of data from kafka on all of our > systems when migrating from spark 2.4.5 to Spark 3.0.0 (and Spark 3.0.1). > I have created a sample project to isolate the problem as much as possible, > with just a read of all data from a kafka topic (see > [https://github.com/codegorillauk/spark-kafka-read] ). > With 2.4.5, across multiple runs, > I get a stable read rate of 1,120,000 (1.12 mill) rows per second. > With 3.0.0 or 3.0.1, across multiple runs, > I get a stable read rate of 632,000 (0.632 mil) rows per second. > This represents a *44% loss in performance*. Which is a lot. > I have been working through the spark-sql-kafka-0-10 code base, but changes for > spark 3 have been ongoing for over a year and it's difficult to pinpoint an > exact change or reason for the degradation. > I am happy to help fix this problem, but will need some assistance as I am > unfamiliar with the spark-sql-kafka-0-10 project. > > A sample of the data my test reads (note: it's not parsing csv - this is just > test data) > > 160692180,001e0610e532,lightsense,tsl250rd,intensity,21853,53.262,acceleration_z,651,ep,290,commit,913,pressure,138,pm1,799,uv_intensity,823,idletime,-372,count,-72,ir_intensity,185,concentration,-61,flags,-532,tx,694.36,ep_heatsink,-556.92,acceleration_x,-221.40,fw,910.53,sample_flow_rate,-959.60,uptime,-515.15,pm10,-768.03,powersupply,214.72,magnetic_field_y,-616.04,alphasense,606.73,AoT_Chicago,053,Racine > Ave & 18th St Chicago IL,41.857959,-87.6564270002,AoT Chicago (S) > [C],2017/12/15 00:00:00,
[jira] [Updated] (SPARK-28367) Kafka connector infinite wait because metadata never updated
[ https://issues.apache.org/jira/browse/SPARK-28367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-28367: -- Fix Version/s: 3.1.0 > Kafka connector infinite wait because metadata never updated > > > Key: SPARK-28367 > URL: https://issues.apache.org/jira/browse/SPARK-28367 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.1.3, 2.2.3, 2.3.3, 2.4.3, 3.0.0, 3.1.0 >Reporter: Gabor Somogyi >Priority: Critical > Fix For: 3.1.0 > > > Spark uses an old, deprecated API, poll(long), which never returns and > stays in a live lock if metadata is not updated (for instance when the broker > disappears at consumer creation). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
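The difference between the two poll variants can be sketched as follows (a hedged sketch against the kafka-clients API; it assumes an already configured consumer and a reachable broker, so it is not runnable standalone):

```scala
import java.time.Duration
import org.apache.kafka.clients.consumer.{Consumer, ConsumerRecords}

// poll(long) applies its timeout only after cluster metadata is available,
// so it can block forever when the broker disappears at consumer creation.
// poll(Duration) bounds the whole call, metadata fetch included.
def boundedPoll[K, V](consumer: Consumer[K, V]): ConsumerRecords[K, V] = {
  // deprecated variant, can hang indefinitely: consumer.poll(10000L)
  consumer.poll(Duration.ofSeconds(10)) // returns empty records on timeout
}
```

The subtasks resolve the issue in this spirit: a vanished broker surfaces as an empty result or an error instead of an infinite wait.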
[jira] [Closed] (SPARK-28367) Kafka connector infinite wait because metadata never updated
[ https://issues.apache.org/jira/browse/SPARK-28367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-28367.
[jira] [Resolved] (SPARK-28367) Kafka connector infinite wait because metadata never updated
[ https://issues.apache.org/jira/browse/SPARK-28367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-28367. --- Resolution: Fixed
[jira] [Commented] (SPARK-28367) Kafka connector infinite wait because metadata never updated
[ https://issues.apache.org/jira/browse/SPARK-28367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247813#comment-17247813 ] Gabor Somogyi commented on SPARK-28367: --- The issue is solved in the subtasks, so closing this.
[jira] [Comment Edited] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246643#comment-17246643 ] Gabor Somogyi edited comment on SPARK-33635 at 12/9/20, 4:39 PM: - {quote}I no longer believe this is a true regression in performance, I now think that 2.4.5 was "cheating". {quote} If by cheating you mean that Spark uses one consumer from multiple threads, then the answer is no. A Kafka consumer is strictly forbidden to be used from multiple threads. If such a thing happens, Kafka detects it and an exception is thrown which stops the query immediately.
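The thread-safety guarantee referred to above can be sketched like this (hedged; the consumer setup is assumed). KafkaConsumer guards its public methods and fails fast on concurrent access; wakeup() is the one documented exception that may be called from another thread:

```scala
import org.apache.kafka.clients.consumer.KafkaConsumer

// Calling poll() (or almost any other method) from a second thread throws
// java.util.ConcurrentModificationException
// ("KafkaConsumer is not safe for multi-threaded access"),
// which is why a query sharing one consumer between threads would stop
// immediately. wakeup() is safe from another thread and makes a blocked
// poll() throw WakeupException.
def interruptBlockedPoll(consumer: KafkaConsumer[Array[Byte], Array[Byte]]): Unit =
  consumer.wakeup()
```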
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246643#comment-17246643 ] Gabor Somogyi commented on SPARK-33635: --- {quote}I no longer believe this is a true regression in performance, I now think that 2.4.5 was "cheating". {quote} If by cheating you mean that Spark uses one consumer from multiple threads, then the answer is no. A Kafka consumer is strictly forbidden to be used from multiple threads. If such a thing happens, Kafka detects it and an exception is thrown which stops the query immediately.
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246640#comment-17246640 ] Gabor Somogyi commented on SPARK-33635: --- I've changed the component to SQL because you're not executing a Structured Streaming query but a SQL batch.
[jira] [Updated] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-33635: -- Component/s: (was: Structured Streaming) SQL
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246633#comment-17246633 ] Gabor Somogyi commented on SPARK-33635: --- Since you're measuring speed: I've ported the Kafka source from DSv1 to DSv2. DSv1 is the default, but DSv2 can be tried out by setting "spark.sql.sources.useV1SourceList" properly. If you can try it out, I would appreciate it.
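Trying the DSv2 path could look like the sketch below. Hedged: the shown value of "spark.sql.sources.useV1SourceList" is an assumption and the default list may differ between Spark versions; the point is only that removing "kafka" from the list switches reads to the DSv2 Kafka source:

```scala
import org.apache.spark.sql.SparkSession

// Dropping "kafka" from the V1 source list makes Spark pick the DSv2
// Kafka source instead of the default DSv1 one.
val spark = SparkSession.builder()
  .appName("kafka-dsv2-try")
  .config("spark.sql.sources.useV1SourceList", "avro,csv,json,orc,parquet,text")
  .getOrCreate()
```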
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246631#comment-17246631 ] Gabor Somogyi commented on SPARK-33635: --- BTW, I'm sure you know, but using collect gathers all the data on the driver side, which is not suggested under any circumstances.
[jira] [Commented] (SPARK-33635) Performance regression in Kafka read
[ https://issues.apache.org/jira/browse/SPARK-33635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246625#comment-17246625 ] Gabor Somogyi commented on SPARK-33635: --- [~david.wyles] try to turn off Kafka consumer caching. Apart from that, there were no super significant changes which could cause this. I've taken a look at your application and it does groupBy and the like. This is not related to Kafka read performance, since the Spark SQL engine contains a huge amount of changes. I suggest creating an application which just moves simple data from one topic into another, and please use the exact same broker version. If it's still slow, we can measure further.
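The suggested minimal benchmark (move data from one topic to another, no parsing or aggregation) could be sketched as below; broker and topic names are placeholders and a running Spark/Kafka environment is assumed:

```scala
import org.apache.spark.sql.SparkSession

// Batch-copy one Kafka topic into another: a pure read + write,
// so the measurement is not polluted by the SQL engine (groupBy etc.).
val spark = SparkSession.builder().appName("kafka-copy-bench").getOrCreate()
spark.read
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "source-topic")
  .load()
  .selectExpr("key", "value")
  .write
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("topic", "sink-topic")
  .save()
```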
[jira] [Updated] (SPARK-32910) Remove UninterruptibleThread usage from KafkaOffsetReader
[ https://issues.apache.org/jira/browse/SPARK-32910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-32910: -- Affects Version/s: (was: 3.2.0) 3.1.0 > Remove UninterruptibleThread usage from KafkaOffsetReader > - > > Key: SPARK-32910 > URL: https://issues.apache.org/jira/browse/SPARK-32910 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > We've talked about this here: > https://github.com/apache/spark/pull/29729#discussion_r488690731 > This jira stands only if the mentioned PR is merged. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33633) Expose fine grained state information on SS UI
[ https://issues.apache.org/jira/browse/SPARK-33633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-33633: -- Affects Version/s: (was: 3.2.0) 3.1.0 > Expose fine grained state information on SS UI > -- > > Key: SPARK-33633 > URL: https://issues.apache.org/jira/browse/SPARK-33633 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > SPARK-33223 provides aggregated information, but in order to find the > problematic parts, non-aggregated information must also be provided. > In order to find out the best way to do that, some investigation is > needed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33633) Expose fine grained state information on SS UI
[ https://issues.apache.org/jira/browse/SPARK-33633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-33633: -- Description: SPARK-33223 provides aggregated information, but in order to find the problematic parts, non-aggregated information must also be provided. In order to find out the best way to do that, some investigation is needed.
[jira] [Created] (SPARK-33633) Expose fine grained state information on SS UI
Gabor Somogyi created SPARK-33633: - Summary: Expose fine grained state information on SS UI Key: SPARK-33633 URL: https://issues.apache.org/jira/browse/SPARK-33633 Project: Spark Issue Type: Sub-task Components: Structured Streaming Affects Versions: 3.2.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33629) spark.buffer.size not applied in driver from pyspark
[ https://issues.apache.org/jira/browse/SPARK-33629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242195#comment-17242195 ] Gabor Somogyi commented on SPARK-33629: --- I've started to work on this and am going to file a PR soon. > spark.buffer.size not applied in driver from pyspark > > > Key: SPARK-33629 > URL: https://issues.apache.org/jira/browse/SPARK-33629 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > The problem has been discovered here: > [https://github.com/apache/spark/pull/30389#issuecomment-729524618] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33629) spark.buffer.size not applied in driver from pyspark
Gabor Somogyi created SPARK-33629: - Summary: spark.buffer.size not applied in driver from pyspark Key: SPARK-33629 URL: https://issues.apache.org/jira/browse/SPARK-33629 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.1.0 Reporter: Gabor Somogyi The problem has been discovered here: [https://github.com/apache/spark/pull/30389#issuecomment-729524618] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
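For reference, the config itself is set like any other Spark conf; the bug is that the PySpark driver side did not honor the value, not that it cannot be set. A hedged sketch (assumes a Spark runtime):

```scala
import org.apache.spark.sql.SparkSession

// spark.buffer.size (bytes, default 65536) tunes various I/O buffers,
// including ones on the JVM <-> Python worker data path.
val spark = SparkSession.builder()
  .appName("buffer-size-example")
  .config("spark.buffer.size", (128 * 1024).toString)
  .getOrCreate()
```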
[jira] [Updated] (SPARK-32910) Remove UninterruptibleThread usage from KafkaOffsetReader
[ https://issues.apache.org/jira/browse/SPARK-32910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-32910: -- Affects Version/s: (was: 3.1.0) 3.2.0 > Remove UninterruptibleThread usage from KafkaOffsetReader > - > > Key: SPARK-32910 > URL: https://issues.apache.org/jira/browse/SPARK-32910 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: Gabor Somogyi >Priority: Major > > We've talked about this here: > https://github.com/apache/spark/pull/29729#discussion_r488690731 > This jira stands only if the mentioned PR is merged. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32910) Remove UninterruptibleThread usage from KafkaOffsetReader
[ https://issues.apache.org/jira/browse/SPARK-32910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17241403#comment-17241403 ] Gabor Somogyi commented on SPARK-32910: --- I don't think there will be time to get this into 3.1, so I'm changing the target. > Remove UninterruptibleThread usage from KafkaOffsetReader > - > > Key: SPARK-32910 > URL: https://issues.apache.org/jira/browse/SPARK-32910 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > We've talked about this here: > https://github.com/apache/spark/pull/29729#discussion_r488690731 > This jira stands only if the mentioned PR is merged. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33491) Update Structured Streaming UI documentation page
Gabor Somogyi created SPARK-33491: - Summary: Update Structured Streaming UI documentation page Key: SPARK-33491 URL: https://issues.apache.org/jira/browse/SPARK-33491 Project: Spark Issue Type: Sub-task Components: Structured Streaming, Web UI Affects Versions: 3.1.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33143) Make SocketAuthServer socket timeout configurable
[ https://issues.apache.org/jira/browse/SPARK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232824#comment-17232824 ] Gabor Somogyi commented on SPARK-33143: --- [~mszurap] the OS and network guys are still working on it but one thing seems sure. It has nothing to do with the RDD size. It's reproducible w/ relatively small RDDs. > Make SocketAuthServer socket timeout configurable > - > > Key: SPARK-33143 > URL: https://issues.apache.org/jira/browse/SPARK-33143 > Project: Spark > Issue Type: Improvement > Components: PySpark, Spark Core >Affects Versions: 2.4.7, 3.0.1 >Reporter: Miklos Szurap >Priority: Major > > In SPARK-21551 the socket timeout for the Pyspark applications has been > increased from 3 to 15 seconds. However it is still hardcoded. > In certain situations even the 15 seconds is not enough, so it should be made > configurable. > This is requested after seeing it in real-life workload failures. > Also it has been suggested and requested in an earlier comment in > [SPARK-18649|https://issues.apache.org/jira/browse/SPARK-18649?focusedCommentId=16493498=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16493498] > In > Spark 2.4 it is under > [PythonRDD.scala|https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L899] > in Spark 3.x the code has been moved to > [SocketAuthServer.scala|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala#L51] > {code} > serverSocket.setSoTimeout(15000) > {code} > Please include this in both 2.4 and 3.x branches. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33143) Make SocketAuthServer socket timeout configurable
[ https://issues.apache.org/jira/browse/SPARK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231270#comment-17231270 ] Gabor Somogyi commented on SPARK-33143: --- [~hyukjin.kwon] thanks for the confirmation, I've started to craft a PR. As a separate thread, I'll come back when we've found the root cause. > Make SocketAuthServer socket timeout configurable > - > > Key: SPARK-33143 > URL: https://issues.apache.org/jira/browse/SPARK-33143 > Project: Spark > Issue Type: Improvement > Components: PySpark, Spark Core >Affects Versions: 2.4.7, 3.0.1 >Reporter: Miklos Szurap >Priority: Major > > In SPARK-21551 the socket timeout for the Pyspark applications has been > increased from 3 to 15 seconds. However it is still hardcoded. > In certain situations even the 15 seconds is not enough, so it should be made > configurable. > This is requested after seeing it in real-life workload failures. > Also it has been suggested and requested in an earlier comment in > [SPARK-18649|https://issues.apache.org/jira/browse/SPARK-18649?focusedCommentId=16493498=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16493498] > In > Spark 2.4 it is under > [PythonRDD.scala|https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L899] > in Spark 3.x the code has been moved to > [SocketAuthServer.scala|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala#L51] > {code} > serverSocket.setSoTimeout(15000) > {code} > Please include this in both 2.4 and 3.x branches. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33143) Make SocketAuthServer socket timeout configurable
[ https://issues.apache.org/jira/browse/SPARK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231244#comment-17231244 ] Gabor Somogyi commented on SPARK-33143: --- I've had a look at this issue, and under some circumstances on heavily loaded systems one of the following calls took more than 15 seconds, ending up in a timeout: * getaddrinfo * socket * settimeout The root cause of this issue needs more investigation (I'm on it). There are 2 suspects: * DNS is involved in getaddrinfo and is not responding or is very slow * The OS is very slow somehow Either way, it's very hard to provide a workaround with a hardcoded timeout. I tend to believe it would be good to make it configurable; otherwise such intermittent issues could make temporary workarounds extremely hard. [~hyukjin.kwon] WDYT? > Make SocketAuthServer socket timeout configurable > - > > Key: SPARK-33143 > URL: https://issues.apache.org/jira/browse/SPARK-33143 > Project: Spark > Issue Type: Improvement > Components: PySpark, Spark Core >Affects Versions: 2.4.7, 3.0.1 >Reporter: Miklos Szurap >Priority: Major > > In SPARK-21551 the socket timeout for the Pyspark applications has been > increased from 3 to 15 seconds. However it is still hardcoded. > In certain situations even the 15 seconds is not enough, so it should be made > configurable. > This is requested after seeing it in real-life workload failures. 
> Also it has been suggested and requested in an earlier comment in > [SPARK-18649|https://issues.apache.org/jira/browse/SPARK-18649?focusedCommentId=16493498=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16493498] > In > Spark 2.4 it is under > [PythonRDD.scala|https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L899] > in Spark 3.x the code has been moved to > [SocketAuthServer.scala|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala#L51] > {code} > serverSocket.setSoTimeout(15000) > {code} > Please include this in both 2.4 and 3.x branches. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
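The change requested in SPARK-33143 boils down to replacing the hardcoded `setSoTimeout(15000)` with a value read from configuration. A minimal Python sketch of that idea follows; it is not Spark's actual implementation, and the config key `spark.python.authenticate.socketTimeout` is a hypothetical name used only for illustration:

```python
import socket

# The default mirrors today's hardcoded value: setSoTimeout(15000) -> 15 seconds.
DEFAULT_AUTH_SOCKET_TIMEOUT = 15.0

def make_auth_server_socket(conf, host="127.0.0.1"):
    """Create an auth-server-style listening socket whose accept timeout
    comes from configuration instead of being hardcoded."""
    timeout = float(conf.get("spark.python.authenticate.socketTimeout",
                             DEFAULT_AUTH_SOCKET_TIMEOUT))
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))      # ephemeral port, as SocketAuthServer does
    server.listen(1)
    server.settimeout(timeout)  # accept() now raises after `timeout` seconds
    return server

server = make_auth_server_socket({"spark.python.authenticate.socketTimeout": "30"})
print(server.gettimeout())  # 30.0
server.close()
```

With such a knob, a cluster hitting slow getaddrinfo/socket/settimeout calls could raise the timeout as a temporary workaround instead of waiting for a code fix.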
[jira] [Updated] (SPARK-33287) Expose state custom metrics information on SS UI
[ https://issues.apache.org/jira/browse/SPARK-33287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-33287: -- Affects Version/s: (was: 3.1.0) 3.0.1 > Expose state custom metrics information on SS UI > > > Key: SPARK-33287 > URL: https://issues.apache.org/jira/browse/SPARK-33287 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming, Web UI >Affects Versions: 3.0.1 >Reporter: Gabor Somogyi >Priority: Major > > Since not all custom metrics hold useful information it would be good to add > exclude possibility. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33273) Fix Flaky Test: ThriftServerQueryTestSuite. subquery_scalar_subquery_scalar_subquery_select_sql
[ https://issues.apache.org/jira/browse/SPARK-33273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222896#comment-17222896 ] Gabor Somogyi commented on SPARK-33273: --- I've just faced this too: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130409/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/subquery_scalar_subquery_scalar_subquery_select_sql/ > Fix Flaky Test: ThriftServerQueryTestSuite. > subquery_scalar_subquery_scalar_subquery_select_sql > --- > > Key: SPARK-33273 > URL: https://issues.apache.org/jira/browse/SPARK-33273 > Project: Spark > Issue Type: Test > Components: Tests >Affects Versions: 3.1.0 >Reporter: Dongjoon Hyun >Priority: Major > > - > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130369/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/subquery_scalar_subquery_scalar_subquery_select_sql/ > {code} > [info] - subquery/scalar-subquery/scalar-subquery-select.sql *** FAILED *** > (3 seconds, 877 milliseconds) > [info] Expected "[1]0 2017-05-04 01:01:0...", but got "[]0 > 2017-05-04 01:01:0..." Result did not match for query #3 > [info] SELECT (SELECT min(t3d) FROM t3) min_t3d, > [info] (SELECT max(t2h) FROM t2) max_t2h > [info] FROM t1 > [info] WHERE t1a = 'val1c' (ThriftServerQueryTestSuite.scala:197) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33287) Expose state custom metrics information on SS UI
[ https://issues.apache.org/jira/browse/SPARK-33287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-33287: -- Description: Since not all custom metrics hold useful information it would be good to add exclude possibility. > Expose state custom metrics information on SS UI > > > Key: SPARK-33287 > URL: https://issues.apache.org/jira/browse/SPARK-33287 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming, Web UI >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > Since not all custom metrics hold useful information it would be good to add > exclude possibility. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33287) Expose state custom metrics information on SS UI
Gabor Somogyi created SPARK-33287: - Summary: Expose state custom metrics information on SS UI Key: SPARK-33287 URL: https://issues.apache.org/jira/browse/SPARK-33287 Project: Spark Issue Type: Sub-task Components: Structured Streaming, Web UI Affects Versions: 3.1.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33222) Expose missing information (graphs) on SS UI
[ https://issues.apache.org/jira/browse/SPARK-33222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218974#comment-17218974 ] Gabor Somogyi commented on SPARK-33222: --- I've started to implement it. > Expose missing information (graphs) on SS UI > > > Key: SPARK-33222 > URL: https://issues.apache.org/jira/browse/SPARK-33222 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming, Web UI >Affects Versions: 3.0.1 >Reporter: Gabor Somogyi >Priority: Major > > There are couple of things which not yet shown on Structured Streaming UI. > I'm creating subtasks to add them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33224) Expose watermark information on SS UI
Gabor Somogyi created SPARK-33224: - Summary: Expose watermark information on SS UI Key: SPARK-33224 URL: https://issues.apache.org/jira/browse/SPARK-33224 Project: Spark Issue Type: Sub-task Components: Structured Streaming, Web UI Affects Versions: 3.0.1 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33223) Expose state information on SS UI
Gabor Somogyi created SPARK-33223: - Summary: Expose state information on SS UI Key: SPARK-33223 URL: https://issues.apache.org/jira/browse/SPARK-33223 Project: Spark Issue Type: Sub-task Components: Structured Streaming, Web UI Affects Versions: 3.0.1 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33222) Expose missing information (graphs) on SS UI
Gabor Somogyi created SPARK-33222: - Summary: Expose missing information (graphs) on SS UI Key: SPARK-33222 URL: https://issues.apache.org/jira/browse/SPARK-33222 Project: Spark Issue Type: Improvement Components: Structured Streaming, Web UI Affects Versions: 3.0.1 Reporter: Gabor Somogyi There are a couple of things which are not yet shown on the Structured Streaming UI. I'm creating subtasks to add them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25547) Pluggable jdbc connection factory
[ https://issues.apache.org/jira/browse/SPARK-25547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213152#comment-17213152 ] Gabor Somogyi commented on SPARK-25547: --- [~fsauer65] The JDBC connection provider API has been added here: https://github.com/apache/spark/blob/dc697a8b598aea922ee6620d87f3ace2f7947231/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcConnectionProvider.scala#L36 Do you think we can close this jira? > Pluggable jdbc connection factory > - > > Key: SPARK-25547 > URL: https://issues.apache.org/jira/browse/SPARK-25547 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.3.1 >Reporter: Frank Sauer >Priority: Major > > The ability to provide a custom connectionFactoryProvider via JDBCOptions so > that JdbcUtils.createConnectionFactory can produce a custom connection > factory would be very useful. In our case we needed to have the ability to > load balance connections to an AWS Aurora Postgres cluster by round-robining > through the endpoints of the read replicas since their own load balancing was > insufficient. We got away with it by copying most of the spark jdbc package, > providing this feature there and changing the format from jdbc to our new > package. However it would be nice if this were supported out of the box via > a new option in JDBCOptions providing the classname for a > ConnectionFactoryProvider. I'm creating this Jira in order to submit a PR > which I have ready to go. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
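The pluggable provider idea discussed in SPARK-25547 is essentially a registry pattern: providers declare whether they can handle a request, and the first one that claims it creates the connection. A simplified Python sketch of the shape of that mechanism follows; the class and method names are illustrative, not Spark's actual `JdbcConnectionProvider` API, and the "connection" is a placeholder string:

```python
class ConnectionProvider:
    """Base provider: decides whether it can handle a driver/options pair."""
    name = "basic"

    def can_handle(self, driver, options):
        return True  # the basic provider is the fallback and accepts anything

    def get_connection(self, driver, options):
        return "connected:" + options.get("url", "default")  # stand-in for a JDBC connection


class RoundRobinAuroraProvider(ConnectionProvider):
    """Illustrative custom provider that round-robins over read-replica endpoints,
    as the use case in this issue describes."""
    name = "aurora-rr"

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self._next = 0

    def can_handle(self, driver, options):
        return options.get("provider") == self.name

    def get_connection(self, driver, options):
        endpoint = self.endpoints[self._next % len(self.endpoints)]
        self._next += 1
        return "connected:" + endpoint


_providers = []

def register_provider(provider):
    _providers.append(provider)

def create_connection(driver, options):
    # The first registered provider that claims the request wins; order matters,
    # so the catch-all basic provider should be registered last.
    for provider in _providers:
        if provider.can_handle(driver, options):
            return provider.get_connection(driver, options)
    raise ValueError("no connection provider registered for " + driver)
```

Registering `RoundRobinAuroraProvider(["replica-1", "replica-2"])` and calling `create_connection` twice with `{"provider": "aurora-rr"}` alternates between the two endpoints, which is the load-balancing behavior the reporter wanted without forking the jdbc package.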
[jira] [Commented] (SPARK-32229) Application entry parsing fails because DriverWrapper registered instead of the normal driver
[ https://issues.apache.org/jira/browse/SPARK-32229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212327#comment-17212327 ] Gabor Somogyi commented on SPARK-32229: --- Started to work on this. > Application entry parsing fails because DriverWrapper registered instead of > the normal driver > - > > Key: SPARK-32229 > URL: https://issues.apache.org/jira/browse/SPARK-32229 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > In some cases DriverWrapper registered by DriverRegistry which causes > exception in PostgresConnectionProvider: > https://github.com/apache/spark/blob/371b35d2e0ab08ebd853147c6673de3adfad0553/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/DriverRegistry.scala#L53 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33102) Use stringToSeq on SQL list typed parameters
[ https://issues.apache.org/jira/browse/SPARK-33102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1723#comment-1723 ] Gabor Somogyi commented on SPARK-33102: --- Filing a PR soon... > Use stringToSeq on SQL list typed parameters > > > Key: SPARK-33102 > URL: https://issues.apache.org/jira/browse/SPARK-33102 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33102) Use stringToSeq on SQL list typed parameters
Gabor Somogyi created SPARK-33102: - Summary: Use stringToSeq on SQL list typed parameters Key: SPARK-33102 URL: https://issues.apache.org/jira/browse/SPARK-33102 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32047) Add provider disable possibility just like in delegation token provider
[ https://issues.apache.org/jira/browse/SPARK-32047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206208#comment-17206208 ] Gabor Somogyi commented on SPARK-32047: --- I intend to file a PR next week... > Add provider disable possibility just like in delegation token provider > --- > > Key: SPARK-32047 > URL: https://issues.apache.org/jira/browse/SPARK-32047 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > There is an enable flag in delegation provider area > "spark.security.credentials.%s.enabled". > It would be good to add similar to the JDBC secure connection provider area > because this would make embedded providers interchangeable (embedded can be > turned off and another provider w/ a different name can be registered). This > makes sense only if we create an API for the secure JDBC connection provider. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
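The per-provider enable flag described in SPARK-32047 amounts to a name-templated config lookup before a provider is considered. A minimal Python sketch follows; the key pattern is the one quoted in the issue, while the "enabled by default" behavior is an assumption made for illustration:

```python
# Key pattern quoted in the issue; %s is the provider name.
ENABLED_KEY_PATTERN = "spark.security.credentials.%s.enabled"

def enabled_providers(provider_names, conf):
    """Filter providers by their per-name enable flag.
    Assumption: a provider stays enabled unless explicitly set to "false"."""
    enabled = []
    for name in provider_names:
        flag = conf.get(ENABLED_KEY_PATTERN % name, "true")
        if flag.strip().lower() != "false":
            enabled.append(name)
    return enabled

conf = {"spark.security.credentials.postgres.enabled": "false"}
print(enabled_providers(["postgres", "oracle", "mariadb"], conf))  # ['oracle', 'mariadb']
```

This is what makes embedded providers interchangeable: the built-in `postgres` provider can be switched off while a replacement registered under a different name remains active.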
[jira] [Updated] (SPARK-28367) Kafka connector infinite wait because metadata never updated
[ https://issues.apache.org/jira/browse/SPARK-28367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-28367: -- Description: Spark uses an old and deprecated API named poll(long) which never returns and stays in live lock if metadata is not updated (for instance when broker disappears at consumer creation). was: Spark uses an old and deprecated API named poll(long) which never returns and stays in live lock if metadata is not updated (for instance when broker disappears at consumer creation). I've created a small standalone application to test it and the alternatives: https://github.com/gaborgsomogyi/kafka-get-assignment > Kafka connector infinite wait because metadata never updated > > > Key: SPARK-28367 > URL: https://issues.apache.org/jira/browse/SPARK-28367 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.1.3, 2.2.3, 2.3.3, 2.4.3, 3.0.0, 3.1.0 >Reporter: Gabor Somogyi >Priority: Critical > > Spark uses an old and deprecated API named poll(long) which never returns and > stays in live lock if metadata is not updated (for instance when broker > disappears at consumer creation). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
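The live lock described in SPARK-28367 comes from an unbounded wait for metadata; the general fix direction is a deadline-bounded retry instead of a call that can block forever. A generic Python sketch of that pattern follows; the `fetch_assignment` callable is hypothetical, standing in for whatever Kafka consumer/admin call actually fetches the metadata:

```python
import time

def await_metadata(fetch_assignment, timeout_s=120.0, retry_interval_s=1.0):
    """Retry a metadata fetch until it yields a result or the deadline passes,
    instead of blocking indefinitely the way the deprecated poll(long) can."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_assignment()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            # A broker that disappeared at consumer creation now surfaces as
            # a timely error rather than an infinite wait.
            raise TimeoutError("Kafka metadata not available before the deadline")
        time.sleep(retry_interval_s)
```

The key property is that every code path either returns an assignment or raises within `timeout_s`, so a missing broker produces a diagnosable failure instead of a hung driver.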
[jira] [Commented] (SPARK-32910) Remove UninterruptibleThread usage from KafkaOffsetReader
[ https://issues.apache.org/jira/browse/SPARK-32910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197617#comment-17197617 ] Gabor Somogyi commented on SPARK-32910: --- Started to work on this. > Remove UninterruptibleThread usage from KafkaOffsetReader > - > > Key: SPARK-32910 > URL: https://issues.apache.org/jira/browse/SPARK-32910 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > > We've talked about this here: > https://github.com/apache/spark/pull/29729#discussion_r488690731 > This jira stands only if the mentioned PR is merged. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-32910) Remove UninterruptibleThread usage from KafkaOffsetReader
Gabor Somogyi created SPARK-32910: - Summary: Remove UninterruptibleThread usage from KafkaOffsetReader Key: SPARK-32910 URL: https://issues.apache.org/jira/browse/SPARK-32910 Project: Spark Issue Type: Sub-task Components: Structured Streaming Affects Versions: 3.1.0 Reporter: Gabor Somogyi We've talked about this here: https://github.com/apache/spark/pull/29729#discussion_r488690731 This jira stands only if the mentioned PR is merged. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32032) Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
[ https://issues.apache.org/jira/browse/SPARK-32032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194269#comment-17194269 ] Gabor Somogyi commented on SPARK-32032: --- [~Bartalos] the query is not progressing. I've just finished the PR preparation and am going to file a PR soon... > Eliminate deprecated poll(long) API calls to avoid infinite wait in driver > -- > > Key: SPARK-32032 > URL: https://issues.apache.org/jira/browse/SPARK-32032 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32032) Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
[ https://issues.apache.org/jira/browse/SPARK-32032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191999#comment-17191999 ] Gabor Somogyi commented on SPARK-32032: --- As an interim result of the effort, I've created a document summarising everything: https://docs.google.com/document/d/1gAh0pKgZUgyqO2Re3sAy-fdYpe_SxpJ6DkeXE8R1P7E/edit?usp=sharing Feel free to comment and help the effort to fix this nasty issue. [~zsxwing], I'm pretty sure you're interested in the details. This change touches key parts of the Kafka connector. > Eliminate deprecated poll(long) API calls to avoid infinite wait in driver > -- > > Key: SPARK-32032 > URL: https://issues.apache.org/jira/browse/SPARK-32032 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32520) Flaky Test: KafkaSourceStressSuite.stress test with multiple topics and partitions
[ https://issues.apache.org/jira/browse/SPARK-32520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17170073#comment-17170073 ] Gabor Somogyi commented on SPARK-32520: --- Commented on https://issues.apache.org/jira/browse/SPARK-32519. If it comes up often and Jungtaek can't take a look, then I suggest a rollback; I can pick it up 3 weeks later (vacation). > Flaky Test: KafkaSourceStressSuite.stress test with multiple topics and > partitions > -- > > Key: SPARK-32520 > URL: https://issues.apache.org/jira/browse/SPARK-32520 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming, Tests >Affects Versions: 3.1.0 >Reporter: Hyukjin Kwon >Priority: Major > > {{KafkaSourceStressSuite.stress test with multiple topics and partitions}} > seems flaky in GitHub Actions build: > https://github.com/apache/spark/pull/29335/checks?check_run_id=940205463 > {code} > KafkaSourceStressSuite: > - stress test with multiple topics and partitions *** FAILED *** (2 minutes, > 7 seconds) > Timed out waiting for stream: The code passed to failAfter did not complete > within 30 seconds. 
> java.lang.Thread.getStackTrace(Thread.java:1559) > org.scalatest.concurrent.TimeLimits.failAfterImpl(TimeLimits.scala:234) > org.scalatest.concurrent.TimeLimits.failAfterImpl$(TimeLimits.scala:233) > > org.apache.spark.sql.kafka010.KafkaSourceTest.failAfterImpl(KafkaMicroBatchSourceSuite.scala:53) > org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:230) > org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:229) > > org.apache.spark.sql.kafka010.KafkaSourceTest.failAfter(KafkaMicroBatchSourceSuite.scala:53) > > org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:471) > > org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:470) > scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149) > Caused by: null > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2156) > > org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:483) > > org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:472) > > scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > > org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127) > > org.scalatest.concurrent.TimeLimits.failAfterImpl(TimeLimits.scala:239) > > org.scalatest.concurrent.TimeLimits.failAfterImpl$(TimeLimits.scala:233) > > org.apache.spark.sql.kafka010.KafkaSourceTest.failAfterImpl(KafkaMicroBatchSourceSuite.scala:53) > > org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:230) > > org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:229) > == Progress == > AssertOnQuery(, ) > AddKafkaData(topics = Set(stress4, stress2, stress1, stress5, stress3), > data = empty Range 0 until 0, message = ) > CheckAnswer: > StopStream > AddKafkaData(topics = Set(stress4, stress2, stress1, stress5, stress3), > data = Range 0 until 8, message = ) > > 
StartStream(ProcessingTimeTrigger(0),org.apache.spark.util.SystemClock@7dce5824,Map(),null) > CheckAnswer: [1],[2],[3],[4],[5],[6],[7],[8] > CheckAnswer: [1],[2],[3],[4],[5],[6],[7],[8] > StopStream > > StartStream(ProcessingTimeTrigger(0),org.apache.spark.util.SystemClock@7255955e,Map(),null) > AddKafkaData(topics = Set(stress4, stress6, stress2, stress1, stress5, > stress3), data = Range 8 until 9, message = Add topic stress7) > AddKafkaData(topics = Set(stress4, stress6, stress2, stress1, stress5, > stress3), data = Range 9 until 10, message = ) > AddKafkaData(topics = Set(stress4, stress6, stress2, stress1, stress5, > stress3), data = Range 10 until 15, message = Add partition) > AddKafkaData(topics = Set(stress4, stress6, stress2, stress8, stress1, > stress5, stress3), data = empty Range 15 until 15, message = Add topic > stress9) > AddKafkaData(topics = Set(stress4, stress6, stress2, stress8, stress1, > stress3), data = Range 15 until 16, message = Delete topic stress5) > AddKafkaData(topics = Set(stress4, stress6, stress2, stress8, stress1, > stress3, stress10), data = Range 16 until 23, message = Add topic stress11) > CheckAnswer: > [1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23] > StopStream > AddKafkaData(topics = Set(stress4, stress6, stress2, stress8, stress1, >
[jira] [Commented] (SPARK-32519) test of org.apache.spark.sql.kafka010.KafkaSourceStressSuite failed for aarch64
[ https://issues.apache.org/jira/browse/SPARK-32519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169850#comment-17169850 ] Gabor Somogyi commented on SPARK-32519: --- I'm on a 3-week vacation but took a quick look at it. This 30 seconds means the timeout is in place: {code:java} The code passed to failAfter did not complete within 30 seconds {code} Kafka is known to be flaky in general, which is the case here as well. I've seen this issue before so I don't think the change itself caused it. It may happen that the change itself made this more frequent, but I can hardly believe it is the root cause. As a side note, a timeout is a timeout on all platforms, but if I understand correctly this happens only on aarch64, right? cc [~kabhwan] > test of org.apache.spark.sql.kafka010.KafkaSourceStressSuite failed for > aarch64 > --- > > Key: SPARK-32519 > URL: https://issues.apache.org/jira/browse/SPARK-32519 > Project: Spark > Issue Type: Test > Components: Tests >Affects Versions: 3.0.0 >Reporter: huangtianhua >Priority: Major > > The aarch64 maven job failed after the commit > https://github.com/apache/spark/commit/813532d10310027fee9e12680792cee2e1c2b7c7 >merged, see the log > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/353/testReport/junit/org.apache.spark.sql.kafka010/KafkaSourceStressSuite/stress_test_with_multiple_topics_and_partitions/ > I took test in my aarch64 instance, if I reset the commit > 813532d10310027fee9e12680792cee2e1c2b7c7 the test is ok. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32032) Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
[ https://issues.apache.org/jira/browse/SPARK-32032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167214#comment-17167214 ] Gabor Somogyi commented on SPARK-32032: --- I'm working on the solution. > Eliminate deprecated poll(long) API calls to avoid infinite wait in driver > -- > > Key: SPARK-32032 > URL: https://issues.apache.org/jira/browse/SPARK-32032 > Project: Spark > Issue Type: Sub-task > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Gabor Somogyi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32032) Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
[ https://issues.apache.org/jira/browse/SPARK-32032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167213#comment-17167213 ] Gabor Somogyi commented on SPARK-32032:
---
I've renamed the jira because the solution is not simply to use a different API.

> Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
> --------------------------------------------------------------------------
>
> Key: SPARK-32032
> URL: https://issues.apache.org/jira/browse/SPARK-32032
> Project: Spark
> Issue Type: Sub-task
> Components: Structured Streaming
> Affects Versions: 3.1.0
> Reporter: Gabor Somogyi
> Priority: Major
>
[jira] [Updated] (SPARK-32032) Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
[ https://issues.apache.org/jira/browse/SPARK-32032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-32032:
---
Summary: Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
(was: Use new poll API in Kafka connector driver side to avoid infinite wait)

> Eliminate deprecated poll(long) API calls to avoid infinite wait in driver
> --------------------------------------------------------------------------
>
> Key: SPARK-32032
> URL: https://issues.apache.org/jira/browse/SPARK-32032
> Project: Spark
> Issue Type: Sub-task
> Components: Structured Streaming
> Affects Versions: 3.1.0
> Reporter: Gabor Somogyi
> Priority: Major
>
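Background on the renamed summary: the deprecated KafkaConsumer.poll(long) only bounds the wait for records, so the call can still block indefinitely while fetching metadata in the driver, whereas poll(Duration) (KIP-266, Kafka 2.0+) applies the deadline to the entire call. The difference can be simulated in a self-contained sketch (illustrative only, not Kafka's actual implementation; the two-phase split below is a simplification):

```java
import java.time.Duration;

public class PollTimeoutSketch {
    // Simulated two-phase poll: first wait for "metadata", then fetch records.

    // Old-style semantics: the timeout only covers the record wait. If the
    // metadata never arrives, this loops forever -- the infinite wait in the
    // driver that this jira describes.
    static String pollOldStyle(long recordTimeoutMs, boolean metadataAvailable)
            throws InterruptedException {
        while (!metadataAvailable) {
            Thread.sleep(10); // unbounded metadata wait
        }
        return "records";
    }

    // New-style semantics: one deadline bounds the whole call, including the
    // metadata phase, so the caller always gets control back.
    static String pollWithDuration(Duration timeout, boolean metadataAvailable) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!metadataAvailable) {
            if (System.nanoTime() >= deadline) {
                return "empty"; // gave up waiting for metadata
            }
        }
        return "records";
    }

    public static void main(String[] args) {
        // With metadata unavailable, the Duration-bounded poll still returns.
        System.out.println(pollWithDuration(Duration.ofMillis(50), false)); // prints "empty"
        // pollOldStyle(1000, false) would never return in the same situation.
    }
}
```

This also explains why the rename matters: simply swapping the call to poll(Duration) changes blocking behavior in places that silently relied on the unbounded wait, so the fix is about eliminating those call sites, not a mechanical API substitution.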
[jira] [Created] (SPARK-32482) Eliminate deprecated poll(long) API calls to avoid infinite wait in tests
Gabor Somogyi created SPARK-32482:
---
Summary: Eliminate deprecated poll(long) API calls to avoid infinite wait in tests
Key: SPARK-32482
URL: https://issues.apache.org/jira/browse/SPARK-32482
Project: Spark
Issue Type: Sub-task
Components: Structured Streaming, Tests
Affects Versions: 3.1.0
Reporter: Gabor Somogyi
[jira] [Created] (SPARK-32468) Fix timeout config issue in Kafka connector tests
Gabor Somogyi created SPARK-32468:
---
Summary: Fix timeout config issue in Kafka connector tests
Key: SPARK-32468
URL: https://issues.apache.org/jira/browse/SPARK-32468
Project: Spark
Issue Type: Bug
Components: Structured Streaming, Tests
Affects Versions: 3.1.0
Reporter: Gabor Somogyi

While implementing SPARK-32032 I've found a bug in Kafka: https://issues.apache.org/jira/browse/KAFKA-10318. It will only cause problems later, once that Kafka bug is fixed, but it would be good to address it on the Spark side now: SPARK-32032 intends to bring in AdminClient, where the code blows up with the mentioned ConfigException. Fixing it here would also reduce the code changes in that jira.