Re: Spark-Druid Connectors

2021-02-23 Thread Gian Merlino
Hey Julian,

Your pessimism in this matter is understandable but regrettable!

It would be great to see this effort become part of mainline Druid. It is a
more maintainable approach than a separate repo, because it gets rid of the
risk of interface drift, and it makes sure that all the tests are run
whenever we do a Druid release. It's more upfront work for you (and for
us), but Spark and Druid are both important OSS projects and I think it is
good to encourage better integration between them. I have also written in
the past about the importance of us getting better at accepting
contributions (at https://s.apache.org/aqicd). It is not always easy, since
reviewing contributions takes time, and it is mostly done on a volunteer
basis. But I think if you are game to work with us on this one, let's try
to get it in. I say that out of pure idealism, not having looked at the
design or code at all.

In the mail I linked, I had written:

> For contributors, focusing on UX and tests means writing out (in natural
> language) how your patch changes user experience, and why you think this
> change is a good idea. It also means having good testing of the new stuff
> you're adding, and writing out (in natural language) why you think your
> tests cover all the important cases. Speaking as a person that has
> reviewed a lot of code: these natural language descriptions are *very helpful*,
> especially when they add context to the patch. Don't make reviewers
> reverse-engineer your code to guess what you were thinking.

As I said, I haven't looked at your design doc or PR yet. But if they cover
the above stuff, could you please point me to the right places that have
the most up-to-date info, and I will put my money where my mouth is and
review them in the way that I suggested in that thread. (i.e., focusing on
user experience and test coverage.)

By the way, I think the mailing list chomped your links. I'll reproduce
them here.

1) Mailing list:
https://lists.apache.org/thread.html/r8219a7be0583ae3d9a2303fa7f21872782cf0703812a410bb62acfef%40%3Cdev.druid.apache.org%3E
2) Slack: https://the-asf.slack.com/archives/CJ8D1JTB8/p1581452302483600
3) GitHub: https://github.com/apache/druid/issues/9780
4) Pull request: https://github.com/apache/druid/pull/10920

On Tue, Feb 23, 2021 at 10:37 PM Julian Jaffe wrote:

>
> Hey Druids,
>
> Last April, there was some discussion on this mailing list, Slack, and
> GitHub around building Spark-Druid connectors. After working up a rough
> cut, the effort was dormant until a few weeks ago when I returned to it.
> I’ve opened a pull request for the connectors, but I don’t realistically
> expect it to be accepted. Am I too pessimistic in my assumptions here?
> Otherwise, what’s the best course of action - create a standalone repo and
> add a link in the Druid docs?
>
> Julian
>


Spark-Druid Connectors

2021-02-23 Thread Julian Jaffe

Hey Druids,

Last April, there was some discussion on this mailing list, Slack, and GitHub 
around building Spark-Druid connectors. After working up a rough cut, the 
effort was dormant until a few weeks ago when I returned to it. I’ve opened a 
pull request for the connectors, but I don’t realistically expect it to be 
accepted. Am I too pessimistic in my assumptions here? Otherwise, what’s the 
best course of action - create a standalone repo and add a link in the Druid 
docs?

Julian


Re: Help wanted when adding integration tests

2021-02-23 Thread Suneet Saldanha
I don't have all the answers, but there are helpful notes in the README in
the integration-tests subfolder. Some responses to your questions are inline.
I think all the sections under
https://github.com/apache/druid/tree/master/integration-tests#tips--tricks-for-debugging-and-developing-integration-tests
will be useful.

On Tue, Feb 23, 2021 at 3:15 AM Chen Frank wrote:

> Hi all,
>
>   I am fixing a Druid bug and adding some integration tests, but I am
> blocked by some integration test problems for which I did not find any
> solution in the existing docs.
> Since I do not have enough time to investigate these problems deeply, I am
> writing this email in the hope that some of you can offer me some help or
> suggestions.
>   The command I used to launch the integration test is:
> mvn verify -P integration-tests -Dit.test=ITIndexerTest -Ddocker.build.skip=true
>
>
> - Problem 1. Druid nodes started in Docker by the integration tests start
> very slowly. It took about 2-3 minutes until the nodes were ready to accept
> HTTP requests from the test. The following coordinator log shows that most
> of that time passed during the first 3 log lines. Other nodes behaved the
> same way.
>
> Listening for transport dt_socket at address: 5006
> 2021-02-23T10:28:22,874 INFO [main]
> org.hibernate.validator.internal.util.Version - HV01: Hibernate
> Validator 5.2.5.Final
> 2021-02-23T10:29:19,442 WARN [main]
> org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 2021-02-23T10:29:42,797 INFO [main] org.apache.curator.utils.Compatibility
> - Running in ZooKeeper 3.4.x compatibility mode
> 2021-02-23T10:29:42,839 INFO [main] org.apache.curator.utils.Compatibility
> - Using emulated InjectSessionExpiration
> 2021-02-23T10:29:47,396 INFO [main]
> org.apache.druid.server.emitter.EmitterModule - Using emitter
> [NoopEmitter{}] for metrics and alerts, with dimensions
> [{version=0.21.0-SNAPSHOT}].
>
> This problem causes a long wait before each test runs. The problem also
> exists when the nodes are started manually with the
> docker-compose -f docker-compose.yml command.
>
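
The size of each pause can be read off the timestamps in the coordinator
log quoted above. A minimal sketch in Python, not part of the Druid
tooling (the function name startup_gaps is made up for illustration),
assuming each log entry starts with a timestamp in the format shown and
sits on a single line rather than being wrapped as in this email:

```python
import re
from datetime import datetime

# Leading timestamp of a log entry, e.g. "2021-02-23T10:28:22,874 INFO ..."
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}) ")


def startup_gaps(lines):
    """Return (seconds_since_previous_entry, line) for each timestamped line
    after the first one. Lines without a leading timestamp are ignored."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # continuation line or non-log output
        current = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S,%f")
        if previous is not None:
            gaps.append(((current - previous).total_seconds(), line))
        previous = current
    return gaps
```

Applied to the five timestamped entries above, this reports gaps of
56.568 s, 23.355 s, 0.042 s and 4.557 s, so almost all of the wait falls
before the NativeCodeLoader warning and before the first Curator line.
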
>
> - Problem 2. There are lots of stack traces on the console showing
> “Unable to canonicalize address 127.0.0.1/:2181 because it's not
> resolvable”
>
> 2021-02-23T10:31:43,722 WARN [main-SendThread(127.0.0.1:2181)]
> org.apache.zookeeper.ClientCnxn - Session 0x0 for server 
> 127.0.0.1/:2181,
> unexpected error, closing socket connection and attempting reconnect
> java.lang.IllegalArgumentException: Unable to canonicalize address
> 127.0.0.1/:2181 because it's not resolvable
> at
> org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:65)
> ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
> at
> org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41)
> ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
> at
> org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1001)
> ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
> at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
> [zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
>
> What happened, and how can I solve it so that the console is clean enough
> for me to find the real exception thrown by the integration test?
>
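
This does not address the root cause of the warning, but as a stopgap the
console output can be filtered so that the real exception stands out. A
minimal sketch in Python, not part of the Druid tooling (the function
name drop_entries_containing is made up for illustration), assuming each
log entry starts on a line with a timestamp in the format shown above and
that stack-trace lines follow it without a timestamp:

```python
import re

# A new log entry starts with a timestamp like "2021-02-23T10:31:43,722 ".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3} ")


def drop_entries_containing(lines, marker):
    """Group lines into log entries (header line plus any following lines
    without a timestamp, such as stack frames) and drop every entry in
    which any line contains `marker`."""
    kept = []
    entry = []

    def flush():
        # Keep the buffered entry only if none of its lines has the marker.
        if entry and not any(marker in line for line in entry):
            kept.extend(entry)
        entry.clear()

    for line in lines:
        if ENTRY_START.match(line):
            flush()  # a new entry begins, so decide on the previous one
        entry.append(line)
    flush()
    return kept
```

Calling it with marker="Unable to canonicalize address" removes the whole
WARN entry together with its stack frames and leaves other entries,
including other exceptions, untouched.
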
>
> - Problem 3. Since my test failed, I wanted to check the task logs to find
> out what happened. But I did not find task logs in Docker, so I tried to
> visit the router at http://localhost: to see if the task log could be found
> in the 'Ingestion' tab, but the router requires a username and password to
> authenticate. So, what are the username and password to access the router
> web page?
>
>
https://github.com/apache/druid/blob/master/integration-tests/docker/environment-configs/common#L32
- This is described in the environment setup for the integration tests.
Could you update the README with instructions for how to find this type of
information?

>
> - Problem 4. How do I start and debug an integration test in IntelliJ? I
> tried, but the console showed:
>

These instructions should help you debug an integration test:
https://github.com/apache/druid/tree/master/integration-tests#debugging-druid-while-running-tests


>
>
> com.google.inject.ProvisionException: Unable to provision, see the
> following errors:
>
> 1) Error in custom provider, java.lang.NullPointerException: must
> specify a trustStorePath
>
>   It seems that some configuration needed to start the test in IntelliJ
> is missing.
>
>
> Any help or response from you is greatly appreciated.
>
>
>


Help wanted when adding integration tests

2021-02-23 Thread Chen Frank
Hi all,

  I am fixing a Druid bug and adding some integration tests, but I am blocked
by some integration test problems for which I did not find any solution in the
existing docs.
Since I do not have enough time to investigate these problems deeply, I am
writing this email in the hope that some of you can offer me some help or
suggestions.
  The command I used to launch the integration test is:
mvn verify -P integration-tests -Dit.test=ITIndexerTest -Ddocker.build.skip=true


- Problem 1. Druid nodes started in Docker by the integration tests start very
slowly. It took about 2-3 minutes until the nodes were ready to accept HTTP
requests from the test. The following coordinator log shows that most of that
time passed during the first 3 log lines. Other nodes behaved the same way.

Listening for transport dt_socket at address: 5006
2021-02-23T10:28:22,874 INFO [main] 
org.hibernate.validator.internal.util.Version - HV01: Hibernate Validator 
5.2.5.Final
2021-02-23T10:29:19,442 WARN [main] org.apache.hadoop.util.NativeCodeLoader - 
Unable to load native-hadoop library for your platform... using builtin-java 
classes where applicable
2021-02-23T10:29:42,797 INFO [main] org.apache.curator.utils.Compatibility - 
Running in ZooKeeper 3.4.x compatibility mode
2021-02-23T10:29:42,839 INFO [main] org.apache.curator.utils.Compatibility - 
Using emulated InjectSessionExpiration
2021-02-23T10:29:47,396 INFO [main] 
org.apache.druid.server.emitter.EmitterModule - Using emitter [NoopEmitter{}] 
for metrics and alerts, with dimensions [{version=0.21.0-SNAPSHOT}].

This problem causes a long wait before each test runs. The problem also exists
when the nodes are started manually with the
docker-compose -f docker-compose.yml command.


- Problem 2. There are lots of stack traces on the console showing
“Unable to canonicalize address 127.0.0.1/:2181 because it's not resolvable”

2021-02-23T10:31:43,722 WARN [main-SendThread(127.0.0.1:2181)] 
org.apache.zookeeper.ClientCnxn - Session 0x0 for server 
127.0.0.1/:2181, unexpected error, closing socket connection and 
attempting reconnect
java.lang.IllegalArgumentException: Unable to canonicalize address 
127.0.0.1/:2181 because it's not resolvable
at 
org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:65)
 ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at 
org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41)
 ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at 
org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1001) 
~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060) 
[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]

What happened, and how can I solve it so that the console is clean enough for
me to find the real exception thrown by the integration test?


- Problem 3. Since my test failed, I wanted to check the task logs to find out
what happened. But I did not find task logs in Docker, so I tried to visit the
router at http://localhost: to see if the task log could be found in the
'Ingestion' tab, but the router requires a username and password to
authenticate. So, what are the username and password to access the router web
page?


- Problem 4. How do I start and debug an integration test in IntelliJ? I
tried, but the console showed:



com.google.inject.ProvisionException: Unable to provision, see the following 
errors:

1) Error in custom provider, java.lang.NullPointerException: must specify a 
trustStorePath

  It seems that some configuration needed to start the test in IntelliJ is
missing.


Any help or response from you is greatly appreciated.