[ https://issues.apache.org/jira/browse/BAHIR-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760364#comment-16760364 ]
ASF GitHub Bot commented on BAHIR-186:
--------------------------------------

ckadner commented on issue #75: [BAHIR-186] Reduce size of sql-cloudant test database
URL: https://github.com/apache/bahir/pull/75#issuecomment-460471642

@emlaver -- your changes look good and help speed up the test execution 👍

I do have an issue that may predate your changes, though. After running `mvn clean install` and a few subsequent `mvn test -pl sql-cloudant` runs, I keep running into the following test error in `CloudantChangesDFSuite`:

```
- load data and verify deleted doc is not in results *** FAILED ***
  org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."}
```

When I run only the `CloudantChangesDFSuite` independently, I do not get that error:

```
$ mvn scalatest:test -pl sql-cloudant -Dsuites='org.apache.bahir.cloudant.CloudantChangesDFSuite'
Run starting. Expected test count is: 9
CloudantChangesDFSuite:
Sql-cloudant tests that require Cloudant databases have been enabled by the environment variables CLOUDANT_USER and CLOUDANT_PASSWORD.
- load and save data from Cloudant database
- load and count data from Cloudant search index
- load data and verify deleted doc is not in results
- load data and count rows in filtered dataframe
- save filtered dataframe to database
- save dataframe to database using createDBOnSave=true option
- load and count data from view
- load data from view with MapReduce function
- load data and verify total count of selector, filter, and view option
Run completed in 2 minutes, 40 seconds.
Total number of tests run: 9
Suites: completed 1, aborted 0
Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Maven log (console output) for all unit tests in sql-cloudant:

```
$ mvn test -pl sql-cloudant
...
[INFO] --- scalatest-maven-plugin:1.0:test (test) @ spark-sql-cloudant_2.11 ---
Discovery starting.
Sql-cloudant tests that require Cloudant databases have been enabled by the environment variables CLOUDANT_USER and CLOUDANT_PASSWORD.
...
CloudantChangesDFSuite:
- load and save data from Cloudant database
- load and count data from Cloudant search index
- load data and verify deleted doc is not in results *** FAILED ***
  org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."}
  at org.apache.bahir.cloudant.DefaultSource.create(DefaultSource.scala:162)
  at org.apache.bahir.cloudant.DefaultSource.createRelation(DefaultSource.scala:95)
  at org.apache.bahir.cloudant.DefaultSource.createRelation(DefaultSource.scala:87)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:341)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
  at org.apache.bahir.cloudant.CloudantChangesDFSuite$$anonfun$8.apply$mcV$sp(CloudantChangesDFSuite.scala:67)
  at org.apache.bahir.cloudant.CloudantChangesDFSuite$$anonfun$8.apply(CloudantChangesDFSuite.scala:62)
  at org.apache.bahir.cloudant.CloudantChangesDFSuite$$anonfun$8.apply(CloudantChangesDFSuite.scala:62)
  ...
- load data and count rows in filtered dataframe
- save filtered dataframe to database
- save dataframe to database using createDBOnSave=true option
- load and count data from view
- load data from view with MapReduce function
- load data and verify total count of selector, filter, and view option
...
Run completed in 3 minutes, 34 seconds.
Total number of tests run: 25
Suites: completed 6, aborted 0
Tests: succeeded 24, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
```

However, when I take a look at `sql-cloudant/target/unit-tests.log`, I don't find any errors logged during the execution of that test case:

```
===== TEST OUTPUT FOR org.apache.bahir.cloudant.CloudantChangesDFSuite: 'load data and verify deleted doc is not in results' =====
19/02/04 15:07:11.114 ScalaTest-main-running-CloudantChangesDFSuite INFO SharedState: loading hive config file: jar:file:/Users/ckadner/.m2/repository/org/apache/spark/spark-sql_2.11/2.3.2/spark-sql_2.11-2.3.2-tests.jar!/hive-site.xml
19/02/04 15:07:11.130 ScalaTest-main-running-CloudantChangesDFSuite INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/ckadner/Projects/bahir_apache_merges/sql-cloudant/spark-warehouse/').
19/02/04 15:07:11.130 ScalaTest-main-running-CloudantChangesDFSuite INFO SharedState: Warehouse path is 'file:/Users/ckadner/Projects/bahir_apache_merges/sql-cloudant/spark-warehouse/'.
19/02/04 15:07:11.132 ScalaTest-main-running-CloudantChangesDFSuite INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
19/02/04 15:07:11.288 ScalaTest-main-running-CloudantChangesDFSuite INFO DefaultSource: Loading data from Cloudant using https://testy-spark-connector.cloudant.com/n_flight/_changes?include_docs=true&feed=normal&seq_interval=200&timeout=60000&filter=_selector
19/02/04 15:07:11.292 streaming-start INFO ReceiverTracker: Starting 1 receivers
19/02/04 15:07:11.292 streaming-start INFO ReceiverTracker: ReceiverTracker started
19/02/04 15:07:11.292 streaming-start INFO PluggableInputDStream: Slide time = 8000 ms
19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Storage level = Memory Deserialized 1x Replicated
19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Checkpoint interval = null
19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Remember interval = 8000 ms
19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Initialized and validated org.apache.spark.streaming.dstream.PluggableInputDStream@3fb41f46
19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Slide time = 8000 ms
19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Storage level = Serialized 1x Replicated
19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Checkpoint interval = null
19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Remember interval = 8000 ms
19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@14cb05dd
19/02/04 15:07:11.293 streaming-start INFO RecurringTimer: Started timer for JobGenerator at time 1549321632000
19/02/04 15:07:11.294 streaming-start INFO JobGenerator: Started JobGenerator at 1549321632000 ms
19/02/04 15:07:11.294 streaming-start INFO JobScheduler: Started JobScheduler
19/02/04 15:07:11.294 ScalaTest-main-running-CloudantChangesDFSuite INFO StreamingContext: StreamingContext started
19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Got job 0 (start at DefaultSource.scala:155) with 1 output partitions
19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Final stage: ResultStage 0 (start at DefaultSource.scala:155)
19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Parents of final stage: List()
19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Missing parents: List()
19/02/04 15:07:11.311 dispatcher-event-loop-2 INFO ReceiverTracker: Receiver 0 started
19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Submitting ResultStage 0 (Receiver 0 ParallelCollectionRDD[1] at makeRDD at ReceiverTracker.scala:613), which has no missing parents
19/02/04 15:07:11.327 dag-scheduler-event-loop INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 62.3 KB, free 1638.5 MB)
19/02/04 15:07:11.329 dag-scheduler-event-loop INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 21.8 KB, free 1638.5 MB)
19/02/04 15:07:11.329 dispatcher-event-loop-3 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ckadner-mbp.hsd1.ca.comcast.net:62424 (size: 21.8 KB, free: 1638.6 MB)
19/02/04 15:07:11.329 dag-scheduler-event-loop INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1039
19/02/04 15:07:11.330 dag-scheduler-event-loop INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (Receiver 0 ParallelCollectionRDD[1] at makeRDD at ReceiverTracker.scala:613) (first 15 tasks are for partitions Vector(0))
19/02/04 15:07:11.330 dag-scheduler-event-loop INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
19/02/04 15:07:11.332 dispatcher-event-loop-0 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 9000 bytes)
19/02/04 15:07:11.332 Executor task launch worker for task 0 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
19/02/04 15:07:11.341 Executor task launch worker for task 0 INFO RecurringTimer: Started timer for BlockGenerator at time 1549321631400
19/02/04 15:07:11.341 Executor task launch worker for task 0 INFO BlockGenerator: Started BlockGenerator
19/02/04 15:07:11.342 Thread-35 INFO BlockGenerator: Started block pushing thread
19/02/04 15:07:11.342 dispatcher-event-loop-0 INFO ReceiverTracker: Registered receiver for stream 0 from ckadner-mbp.hsd1.ca.comcast.net:62423
19/02/04 15:07:11.342 Executor task launch worker for task 0 INFO ReceiverSupervisorImpl: Starting receiver 0
19/02/04 15:07:11.342 Executor task launch worker for task 0 INFO ReceiverSupervisorImpl: Called receiver 0 onStart
19/02/04 15:07:11.342 Executor task launch worker for task 0 INFO ReceiverSupervisorImpl: Waiting for receiver to be stopped
19/02/04 15:07:12.003 JobGenerator INFO JobScheduler: Added jobs for time 1549321632000 ms
19/02/04 15:07:12.003 JobScheduler INFO JobScheduler: Starting job streaming job 1549321632000 ms.0 from job set of time 1549321632000 ms
19/02/04 15:07:12.051 streaming-job-executor-0 INFO SparkContext: Starting job: json at DefaultSource.scala:150
19/02/04 15:07:12.052 streaming-job-executor-0 INFO DAGScheduler: Job 1 finished: json at DefaultSource.scala:150, took 0.000025 s
19/02/04 15:07:12.063 dispatcher-event-loop-1 INFO ReceiverTracker: Sent stop signal to all 1 receivers
19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: Received stop signal
19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: Stopping receiver with message: Stopped by driver:
19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: Called receiver onStop
19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: Deregistering receiver 0
19/02/04 15:07:12.064 dispatcher-event-loop-3 ERROR ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver
19/02/04 15:07:12.064 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: Stopped receiver 0
19/02/04 15:07:12.064 dispatcher-event-loop-2 INFO BlockGenerator: Stopping BlockGenerator
19/02/04 15:07:12.402 dispatcher-event-loop-2 INFO RecurringTimer: Stopped timer for BlockGenerator after time 1549321632400
19/02/04 15:07:12.402 dispatcher-event-loop-2 INFO BlockGenerator: Waiting for block pushing thread to terminate
19/02/04 15:07:12.406 Thread-35 INFO BlockGenerator: Pushing out the last 0 blocks
19/02/04 15:07:12.406 Thread-35 INFO BlockGenerator: Stopped block pushing thread
19/02/04 15:07:12.406 dispatcher-event-loop-2 INFO BlockGenerator: Stopped BlockGenerator
19/02/04 15:07:12.406 Executor task launch worker for task 0 INFO ReceiverSupervisorImpl: Stopped receiver without error
19/02/04 15:07:12.407 Executor task launch worker for task 0 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 751 bytes result sent to driver
19/02/04 15:07:12.407 task-result-getter-0 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1076 ms on localhost (executor driver) (1/1)
19/02/04 15:07:12.407 task-result-getter-0 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
19/02/04 15:07:12.408 dag-scheduler-event-loop INFO DAGScheduler: ResultStage 0 (start at DefaultSource.scala:155) finished in 1.096 s
19/02/04 15:07:12.408 streaming-job-executor-0 INFO ReceiverTracker: All of the receivers have deregistered successfully
19/02/04 15:07:12.408 streaming-job-executor-0 INFO ReceiverTracker: ReceiverTracker stopped
19/02/04 15:07:12.408 streaming-job-executor-0 INFO JobGenerator: Stopping JobGenerator immediately
19/02/04 15:07:12.409 streaming-job-executor-0 INFO RecurringTimer: Stopped timer for JobGenerator after time 1549321632000
19/02/04 15:07:12.409 streaming-job-executor-0 INFO JobGenerator: Stopped JobGenerator
19/02/04 15:07:12.687 Spark Context Cleaner INFO ContextCleaner: Cleaned accumulator 470
<<19/02/04 15:07:12. ... Spark Context Cleaner INFO ContextCleaner: Cleaned accumulator ...>>
19/02/04 15:07:12.688 Spark Context Cleaner INFO ContextCleaner: Cleaned accumulator 479
19/02/04 15:07:12.690 dispatcher-event-loop-3 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ckadner-mbp.hsd1.ca.comcast.net:62424 in memory (size: 21.8 KB, free: 1638.6 MB)
19/02/04 15:07:12.693 Spark Context Cleaner INFO ContextCleaner: Cleaned accumulator 467
<<19/02/04 15:07:12. ... Spark Context Cleaner INFO ContextCleaner: Cleaned accumulator ...>>
19/02/04 15:07:12.693 Spark Context Cleaner INFO ContextCleaner: Cleaned accumulator 486
19/02/04 15:07:14.412 streaming-job-executor-0 INFO JobScheduler: Stopped JobScheduler
19/02/04 15:07:14.413 streaming-job-executor-0 INFO StreamingContext: StreamingContext stopped successfully
19/02/04 15:07:14.415 ScalaTest-main-running-CloudantChangesDFSuite INFO CloudantChangesDFSuite: ===== FINISHED org.apache.bahir.cloudant.CloudantChangesDFSuite: 'load data and verify deleted doc is not in results' =====
```

I do, however, find that error message earlier in the log, during the test case `incorrect password throws an error message for changes receiver`, where it is expected:

```
===== TEST OUTPUT FOR org.apache.bahir.cloudant.CloudantOptionSuite: 'incorrect password throws an error message for changes receiver' =====
...
19/02/04 15:06:32.751 Cloudant Receiver ERROR CookieInterceptor: Credentials are incorrect for server https://testy-spark-connector.cloudant.com:443/_session, cookie authentication will not be attempted again by this interceptor object
19/02/04 15:06:33.909 Cloudant Receiver WARN ReceiverSupervisorImpl: Reported error Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."} - org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."}
19/02/04 15:06:33.912 dispatcher-event-loop-1 WARN ReceiverTracker: Error reported by receiver for stream 0: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."} - org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."}
  at org.apache.bahir.cloudant.internal.ChangesReceiver.org$apache$bahir$cloudant$internal$ChangesReceiver$$receive(ChangesReceiver.scala:79)
  at org.apache.bahir.cloudant.internal.ChangesReceiver$$anon$1.run(ChangesReceiver.scala:37)
...
19/02/04 15:06:44.445 ScalaTest-main-running-CloudantOptionSuite INFO CloudantOptionSuite: ===== FINISHED org.apache.bahir.cloudant.CloudantOptionSuite: 'incorrect password throws an error message for changes receiver' =====
```

When I look at the sources here
https://github.com/apache/bahir/blob/c51853d135ad2d9da67804259f4ed0e29223afb3/sql-cloudant/src/main/scala/org/apache/bahir/cloudant/DefaultSource.scala#L159-L163
...it appears that this error is a fluke?
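To make the suspected failure mode concrete, here is a hypothetical sketch (this is not the actual Bahir code; the object, field, and method names are my own illustration): if the last receiver error is kept in a shared mutable field that is never cleared between loads, a later load that produces an empty schema can end up surfacing a stale 401 message left over from an earlier, intentionally failing test.

```scala
// Hypothetical sketch of a stale-error field shared across loads.
object ChangesReceiverState {
  // Written by the receiver thread when a _changes request fails;
  // in this sketch it is never reset before the next load.
  @volatile var lastReceiverErrorMsg: String = ""
}

def createSketch(schemaIsEmpty: Boolean): Unit = {
  if (schemaIsEmpty && ChangesReceiverState.lastReceiverErrorMsg.nonEmpty) {
    // An empty schema is treated as a failure, and whatever message
    // happens to be in the field is thrown -- even if it came from a
    // previous test case in the same JVM.
    throw new RuntimeException(ChangesReceiverState.lastReceiverErrorMsg)
  }
}
```

If something like this is happening, one plausible fix would be to clear the error field at the start of each load, so a suite that runs after the intentional bad-credentials test cannot pick up its 401 message.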
The test case `load data and verify deleted doc is not in results` expects the schema to be empty, but `DefaultSource` treats an empty schema as an error and throws the last `receiverErrorMsg`, which in this case comes from the earlier test case `incorrect password throws an error message for changes receiver`.

At least that was my first impression, but the test case does not actually seem to expect an empty schema:
https://github.com/apache/bahir/blob/c51853d135ad2d9da67804259f4ed0e29223afb3/sql-cloudant/src/test/scala/org/apache/bahir/cloudant/CloudantChangesDFSuite.scala#L67-L69

So I am not sure what is happening. But you might? :-)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Support SSL connection in MQTT SQL Streaming
> --------------------------------------------
>
>                 Key: BAHIR-186
>                 URL: https://issues.apache.org/jira/browse/BAHIR-186
>             Project: Bahir
>          Issue Type: New Feature
>          Components: Spark Streaming Connectors
>            Reporter: Lukasz Antoniak
>            Assignee: Lukasz Antoniak
>            Priority: Major
>             Fix For: Spark-2.4.0
>
>
> Mailing list discussion:
> https://www.mail-archive.com/user@bahir.apache.org/msg00022.html

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)