Re: Spark-Druid Connectors
Hey Julian,

Your pessimism in this matter is understandable but regrettable!

It would be great to see this effort become part of mainline Druid. It is a more maintainable approach than a separate repo, because it gets rid of the risk of interface drift, and it makes sure that all the tests are run whenever we do a Druid release. It's more upfront work for you (and for us), but Spark and Druid are both important OSS projects and I think it is good to encourage better integration between them.

I have also written in the past about the importance of us getting better at accepting contributions (at https://s.apache.org/aqicd). It is not always easy, since reviewing contributions takes time, and it is mostly done on a volunteer basis. But I think if you are game to work with us on this one, let's try to get it in. I say that out of pure idealism, not having looked at the design or code at all.

In the mail I linked, I had written:

> For contributors, focusing on UX and tests means writing out (in natural
> language) how your patch changes user experience, and why you think this
> change is a good idea. It also means having good testing of the new stuff
> you're adding, and writing out (in natural language) why you think your
> tests cover all the important cases. Speaking as a person that has reviewed
> a lot of code: these natural language descriptions are *very helpful*,
> especially when they add context to the patch. Don't make reviewers
> reverse-engineer your code to guess what you were thinking.

As I said, I haven't looked at your design doc or PR yet. But if they cover the above stuff, could you please point me to the right places that have the most up-to-date info, and I will put my money where my mouth is and review them in the way that I suggested in that thread (i.e., focusing on user experience and test coverage).

By the way, I think the mailing list chomped your links. I'll reproduce them here.

1) Mailing list: https://lists.apache.org/thread.html/r8219a7be0583ae3d9a2303fa7f21872782cf0703812a410bb62acfef%40%3Cdev.druid.apache.org%3E
2) Slack: https://the-asf.slack.com/archives/CJ8D1JTB8/p1581452302483600
3) GitHub: https://github.com/apache/druid/issues/9780
4) Pull request: https://github.com/apache/druid/pull/10920

On Tue, Feb 23, 2021 at 10:37 PM Julian Jaffe wrote:

> Hey Druids,
>
> Last April, there was some discussion on this mailing list, Slack, and
> GitHub around building Spark-Druid connectors. After working up a rough
> cut, the effort was dormant until a few weeks ago when I returned to it.
> I’ve opened a pull request for the connectors, but I don’t realistically
> expect it to be accepted. Am I too pessimistic in my assumptions here?
> Otherwise, what’s the best course of action - create a standalone repo and
> add a link in the Druid docs?
>
> Julian
Spark-Druid Connectors
Hey Druids,

Last April, there was some discussion on this mailing list, Slack, and GitHub around building Spark-Druid connectors. After working up a rough cut, the effort was dormant until a few weeks ago when I returned to it. I’ve opened a pull request for the connectors, but I don’t realistically expect it to be accepted. Am I too pessimistic in my assumptions here? Otherwise, what’s the best course of action - create a standalone repo and add a link in the Druid docs?

Julian
Re: Help wanted when adding integration tests
I don't have all the answers, but there are helpful notes in the README in the integration-tests subfolder. Some responses to your questions are inline.

I think all the sections under https://github.com/apache/druid/tree/master/integration-tests#tips--tricks-for-debugging-and-developing-integration-tests will be useful.

On Tue, Feb 23, 2021 at 3:15 AM Chen Frank wrote:

> Hi all,
>
> I am fixing a Druid bug and adding some integration tests, but I am blocked by some integration test problems for which I did not find any solution in the existing docs.
> Since I do not have enough time to investigate these problems deeply, I am writing this email in the hope that some of you can offer help or suggestions.
> The command I used to launch the integration test is:
>
> mvn verify -P integration-tests -Dit.test=ITIndexerTest -Ddocker.build.skip=true
>
> Problem 1. Druid nodes started in Docker by the integration tests start very slowly. It took about 2-3 minutes until the nodes were ready to accept HTTP requests from the test. The following coordinator log shows that most of the time was spent during the first 3 log lines. Other nodes behaved the same.
>
> Listening for transport dt_socket at address: 5006
> 2021-02-23T10:28:22,874 INFO [main] org.hibernate.validator.internal.util.Version - HV01: Hibernate Validator 5.2.5.Final
> 2021-02-23T10:29:19,442 WARN [main] org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2021-02-23T10:29:42,797 INFO [main] org.apache.curator.utils.Compatibility - Running in ZooKeeper 3.4.x compatibility mode
> 2021-02-23T10:29:42,839 INFO [main] org.apache.curator.utils.Compatibility - Using emulated InjectSessionExpiration
> 2021-02-23T10:29:47,396 INFO [main] org.apache.druid.server.emitter.EmitterModule - Using emitter [NoopEmitter{}] for metrics and alerts, with dimensions [{version=0.21.0-SNAPSHOT}].
>
> This problem causes a long wait before a test executes. The problem also exists when the nodes are started by manually executing the docker-compose -f docker-compose.yml command.
>
> Problem 2. There are lots of stack trace messages on the console showing “Unable to canonicalize address 127.0.0.1/:2181 because it's not resolvable”:
>
> 2021-02-23T10:31:43,722 WARN [main-SendThread(127.0.0.1:2181)] org.apache.zookeeper.ClientCnxn - Session 0x0 for server 127.0.0.1/:2181, unexpected error, closing socket connection and attempting reconnect
> java.lang.IllegalArgumentException: Unable to canonicalize address 127.0.0.1/:2181 because it's not resolvable
>     at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:65) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
>     at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
>     at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1001) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060) [zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
>
> What happened, and how can I solve it so that the console is clean enough for me to find the real exception thrown by the IT test?
>
> Problem 3. Since my test failed, I wanted to check the task logs to dig out what happened. But I did not find task logs in Docker, so I tried to visit the router at http://localhost: to see if the task log could be found in the 'Ingestion' tab, but the router requires a username and password to authenticate. So, what are the username and password to access the router web page?

https://github.com/apache/druid/blob/master/integration-tests/docker/environment-configs/common#L32 - This is described in the environment setup for the integration tests. Could you update the README with instructions for how to find this type of information?

> Problem 4. How do I start and debug an integration test in IntelliJ? I tried, but the console showed:

This should help you with instructions on how to debug an integration test: https://github.com/apache/druid/tree/master/integration-tests#debugging-druid-while-running-tests

> com.google.inject.ProvisionException: Unable to provision, see the following errors:
>
> 1) Error in custom provider, java.lang.NullPointerException: must specify a trustStorePath
>
> It seems that some configuration is missing to start the test in IntelliJ.
>
> Any help or response from you is greatly appreciated.
Help wanted when adding integration tests
Hi all,

I am fixing a Druid bug and adding some integration tests, but I am blocked by some integration test problems for which I did not find any solution in the existing docs. Since I do not have enough time to investigate these problems deeply, I am writing this email in the hope that some of you can offer help or suggestions.

The command I used to launch the integration test is:

mvn verify -P integration-tests -Dit.test=ITIndexerTest -Ddocker.build.skip=true

Problem 1. Druid nodes started in Docker by the integration tests start very slowly. It took about 2-3 minutes until the nodes were ready to accept HTTP requests from the test. The following coordinator log shows that most of the time was spent during the first 3 log lines. Other nodes behaved the same.

Listening for transport dt_socket at address: 5006
2021-02-23T10:28:22,874 INFO [main] org.hibernate.validator.internal.util.Version - HV01: Hibernate Validator 5.2.5.Final
2021-02-23T10:29:19,442 WARN [main] org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-02-23T10:29:42,797 INFO [main] org.apache.curator.utils.Compatibility - Running in ZooKeeper 3.4.x compatibility mode
2021-02-23T10:29:42,839 INFO [main] org.apache.curator.utils.Compatibility - Using emulated InjectSessionExpiration
2021-02-23T10:29:47,396 INFO [main] org.apache.druid.server.emitter.EmitterModule - Using emitter [NoopEmitter{}] for metrics and alerts, with dimensions [{version=0.21.0-SNAPSHOT}].

This problem causes a long wait before a test executes. The problem also exists when the nodes are started by manually executing the docker-compose -f docker-compose.yml command.

Problem 2. There are lots of stack trace messages on the console showing “Unable to canonicalize address 127.0.0.1/:2181 because it's not resolvable”:

2021-02-23T10:31:43,722 WARN [main-SendThread(127.0.0.1:2181)] org.apache.zookeeper.ClientCnxn - Session 0x0 for server 127.0.0.1/:2181, unexpected error, closing socket connection and attempting reconnect
java.lang.IllegalArgumentException: Unable to canonicalize address 127.0.0.1/:2181 because it's not resolvable
    at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:65) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
    at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
    at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1001) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060) [zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]

What happened, and how can I solve it so that the console is clean enough for me to find the real exception thrown by the IT test?

Problem 3. Since my test failed, I wanted to check the task logs to dig out what happened. But I did not find task logs in Docker, so I tried to visit the router at http://localhost: to see if the task log could be found in the 'Ingestion' tab, but the router requires a username and password to authenticate. So, what are the username and password to access the router web page?

Problem 4. How do I start and debug an integration test in IntelliJ? I tried, but the console showed:

com.google.inject.ProvisionException: Unable to provision, see the following errors:

1) Error in custom provider, java.lang.NullPointerException: must specify a trustStorePath

It seems that some configuration is missing to start the test in IntelliJ.

Any help or response from you is greatly appreciated.
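
For the console noise described in Problem 2, one possible stopgap (it hides the symptom and does not fix the underlying address resolution error) is to filter the captured console output before reading it. The following is a minimal Python sketch, not part of the Druid tooling. It assumes that every log record starts with a timestamp in the format shown in the logs above, and that the unwanted records all come from the org.apache.zookeeper.ClientCnxn logger. The function name drop_noisy_records and the sample lines are hypothetical and only for illustration.

```python
import re
from typing import Iterable, Iterator

# A new log record starts with a timestamp such as "2021-02-23T10:31:43,722 ".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3} ")

# Assumption: the noise comes from this logger only.
NOISY_LOGGER = "org.apache.zookeeper.ClientCnxn"


def drop_noisy_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines, skipping records from NOISY_LOGGER.

    A record is its timestamped first line plus every following line
    without a timestamp (for example, the stack trace frames).
    """
    skipping = False
    for line in lines:
        if RECORD_START.match(line):
            # A new record begins: decide whether to skip it entirely.
            skipping = NOISY_LOGGER in line
        if not skipping:
            yield line


if __name__ == "__main__":
    # Hypothetical, shortened sample in the same shape as the logs above.
    sample = [
        "2021-02-23T10:31:43,722 WARN [main-SendThread(127.0.0.1:2181)] "
        "org.apache.zookeeper.ClientCnxn - Session 0x0 for server, unexpected error",
        "java.lang.IllegalArgumentException: Unable to canonicalize address",
        "\tat org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)",
        "2021-02-23T10:31:44,000 INFO [main] some.other.Logger - this line is kept",
    ]
    for kept in drop_noisy_records(sample):
        print(kept)
```

To use this on a real run, the saved console output could be passed in as an iterable of lines (for example, an open file or sys.stdin). Note one limitation of this approach: any line without a timestamp that directly follows a skipped record is treated as part of that record and dropped, so test output printed without timestamps could be hidden as well.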