2020-01-09 09:16:15 UTC - Lukas Chripko: Hi guys, I'm trying to build the Python
pulsar-client on Alpine Linux. I feel like I'm almost done, but I'm stuck on one
of the last steps, when linking the _pulsar.so file, which gives me:
```[ 96%] Linking CXX shared library _pulsar.so
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld:
/usr/local/lib/libboost_python3.a(list.o): relocation R_X86_64_PC32 against
undefined symbol `PyList_Type' can not be used when making a shared object;
recompile with -fPIC
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld:
final link failed: bad value
collect2: error: ld returned 1 exit status
make[2]: *** [python/CMakeFiles/_pulsar.dir/build.make:230: python/_pulsar.so]
Error 1
make[1]: *** [CMakeFiles/Makefile2:604: python/CMakeFiles/_pulsar.dir/all]
Error 2
make: *** [Makefile:130: all] Error 2```
It says I should recompile with -fPIC, but that is already the case, as I build
Boost the same way as on the CentOS release:
```./b2 address-model=64 cxxflags=-fPIC link=static threading=multi
variant=release install```
Any ideas what else could be wrong/changed? Or is it possible to link the files
manually?
----
2020-01-09 09:21:09 UTC - Danish Mohd: @Danish Mohd has joined the channel
----
2020-01-09 09:23:18 UTC - Danish Mohd: Hello Everyone,
Greetings!
I could not find the Wikipedia page for Pulsar. Should we create one?
----
2020-01-09 10:11:56 UTC - Roman Popenov: I know there is
<https://pulsar.apache.org/docs/en/standalone/>
----
2020-01-09 10:12:06 UTC - Roman Popenov: I mostly find information through the
website
----
2020-01-09 10:40:11 UTC - Julius.b: Am I getting this right: if I use Apache
Avro schemas while creating a producer in Python, it raises an exception if
the type is not the same as in the class.
`ValueError: 'hi' (type <class 'str'>) do not match ['null', 'int']`
With JSON schemas, though, I can pass any data type without getting an exception.
Can I get the same exceptions for a JSON schema?
----
2020-01-09 11:22:52 UTC - Fernando: I’m having a problem where, after running
functions in `localrun` mode, the function is still running even though the
process has stopped (at least I think). If I list all functions, it’s empty. How
can I inspect the running `function` processes and properly shut them down?
----
2020-01-09 13:33:49 UTC - Roman Popenov: In the end, I think I will go with the
smaller cluster, compare it against the huge cluster that is “suggested” in the
helm chart, and let the Amazon auto-scaler manage the resources
----
2020-01-09 13:38:32 UTC - Naby: I am trying to connect Pulsar to the built-in
InfluxDB connector, and I am wondering if I should set up a function worker
first or whether I can go ahead and connect them without a function worker,
since I can’t figure out how to set one up. Any suggestions on how to set it up,
or where I can find the documentation?
----
2020-01-09 13:39:23 UTC - Roman Popenov: It didn’t seem like it mattered in
which order you run them
----
2020-01-09 13:39:41 UTC - Roman Popenov: I usually run the function because I
don’t want the topics to “clutter”, since my cluster is rather small
----
2020-01-09 13:40:51 UTC - Roman Popenov:
<https://pulsar.apache.org/docs/en/functions-worker/> - good place to start
----
2020-01-09 13:45:36 UTC - Naby: Does this mean I need to set up the function
worker separately, or does running a built-in connector trigger that so I don’t
need to do any additional work?
----
2020-01-09 13:47:26 UTC - Roman Popenov: From my understanding, you only need
to provide a configuration for the connector and it will sink/source to the
destination
----
2020-01-09 13:48:28 UTC - Naby: I’ll check that. Thanks.
----
2020-01-09 13:48:42 UTC - Roman Popenov: For the function worker, you have to
set it up if you want further processing of the data from/to topics
+1 : Naby
----
2020-01-09 13:49:13 UTC - Roman Popenov: There are three runtime configurations
for the function worker:
<https://pulsar.apache.org/docs/en/functions-runtime/>
----
2020-01-09 15:00:47 UTC - vikash: Hello All,
I am getting the following exception in the pulsar-io-jdbc sink connector when
sending a message from the .NET client: producer -> pulsar function ->
pulsar-io-jdbc sink (in the connector I am using schema type Avro),
and even from a Java client producer (publishing to the function's output topic)
-> pulsar-io-jdbc (using the function's output topic).
pulsar function:
```context.newOutputMessage(publishTopic,
Schema.AVRO(EntitiesTableInfo.class)).value(inputObject).sendAsync();```
console log:
```20:15:28.553 [main] INFO org.apache.pulsar.functions.LocalRunner -
RuntimeSpawner quit because of
com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.NullPointerException
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
~[com.google.guava-guava-25.1-jre.jar:?]```
But when I create a new topic and send the message from the Java client producer
to the pulsar-io-jdbc connector, I don't get any exception and the messages sink
fine.
Any help? What is the cause of the exception?
----
2020-01-09 15:33:31 UTC - Sijie Guo: localrun is not managed by pulsar brokers
(function workers)
----
2020-01-09 15:33:41 UTC - Sijie Guo: you can manually kill the process running
local run.
----
2020-01-09 15:34:40 UTC - Sijie Guo: I am not sure if the .NET client supports
schemas.
----
2020-01-09 15:35:04 UTC - Sijie Guo: it seems that the messages don’t carry any
schema version, which causes the NullPointerException
----
2020-01-09 15:37:03 UTC - Sijie Guo: you can use localrun to run a connector
without setting up a function worker
----
2020-01-09 15:38:48 UTC - Sijie Guo: yes. Avro does schema validation for the
messages produced with a given schema. JSON only collects schema information but
does not validate individual messages (because JSON is schemaless).
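For example, with the Python client (a minimal sketch; the topic names and the
`Example` record below are hypothetical), the Avro producer validates each record
it encodes, while the JSON producer just serializes it:
```# Minimal sketch of the difference described above (hypothetical topic/record names).
import pulsar
from pulsar.schema import Record, Integer, AvroSchema, JsonSchema

class Example(Record):
    counter = Integer()   # becomes the Avro type ["null", "int"]

client = pulsar.Client('pulsar://localhost:6650')

# AvroSchema: the record is encoded against the Avro schema, so a value that
# does not match the declared type is rejected (the ValueError shown above).
avro_producer = client.create_producer('example-avro-topic',
                                        schema=AvroSchema(Example))
avro_producer.send(Example(counter=1))

# JsonSchema: the record is just serialized to JSON; individual values are not
# validated against the schema.
json_producer = client.create_producer('example-json-topic',
                                        schema=JsonSchema(Example))
json_producer.send(Example(counter=1))

client.close()```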
----
2020-01-09 15:38:58 UTC - Sijie Guo: @Danish Mohd yes we should
----
2020-01-09 15:44:04 UTC - vikash: but I see the function’s schema has a version
----
2020-01-09 15:44:06 UTC - vikash: ```{
  "version": 0,
  "schemaInfo": {
    "name": "my2-function-output-topic",
    "schema": {
      "type": "record",
      "name": "EntitiesTableInfo",
      "namespace": "org.apache.pulsar.functions.api.publishexample",
      "fields": [
        { "name": "ENTITYID", "type": ["null", "string"] },
        { "name": "ENTITYTYPE", "type": ["null", "string"] },
        { "name": "ENTITYINFOJSON", "type": ["null", "string"] },
        { "name": "TENANTNAME", "type": ["null", "string"] },
        { "name": "TENANTID", "type": ["null", "string"] },
        { "name": "FABRIC", "type": ["null", "string"] },
        { "name": "ENTITYNAME", "type": ["null", "string"] },
        { "name": "SEQUENCENUMBER", "type": "double" },
        { "name": "VALIDFROM", "type": ["null", "string"] },
        { "name": "VALIDTO", "type": ["null", "string"] }
      ]
    },
    "type": "AVRO",
    "properties": {
      "__alwaysAllowNull": "true"
    }
  }
}```
----
2020-01-09 15:58:15 UTC - vikash: And here is the overridden
Function<String, Void> process method; this is where the POJO Avro schema is
used:
```@Override
public Void process(String input, Context context) {
    String publishTopic = (String) context.getUserConfigValueOrDefault("PublishTopic", "my2-function-output-topic");
    Gson gson = new Gson();
    EntitiesTableInfo inputObject = gson.fromJson(input, EntitiesTableInfo.class);
    try {
        context.newOutputMessage(publishTopic, Schema.AVRO(EntitiesTableInfo.class)).value(inputObject).sendAsync();
    } catch (PulsarClientException e) {
        context.getLogger().error(e.toString());
    }
    return null;
}```
----
2020-01-09 16:09:03 UTC - Vladimir Shchur: @Sijie Guo does it mean that if a
producer doesn't send any schema, Pulsar connectors become unusable?
----
2020-01-09 16:16:37 UTC - Sijie Guo: the schema on the broker side is fine. The
problem is that the schema version is not attached to the messages correctly, so
when the JDBC connector receives a message, it can’t find a schema version
associated with the message and throws the NullPointerException.
----
2020-01-09 16:16:46 UTC - Sijie Guo: It is a .NET client issue.
----
2020-01-09 16:17:19 UTC - Sijie Guo: similar to the C++/Python issue I pointed
you to before @vikash
----
2020-01-09 16:21:54 UTC - Vladimir Shchur: @vikash looks like you defined the
schema on the broker side, while the .NET client doesn't support schemas for now
----
2020-01-09 16:22:45 UTC - Vladimir Shchur: maybe you can try using the
connector without a schema, but I'm not sure it will work
----
2020-01-09 16:27:42 UTC - vikash: I think yes... it will not work without a
schema, because the pulsar-io-jdbc connector throws this exception for the topic
when the connector instance starts:
```java.util.concurrent.CompletionException:
org.apache.pulsar.client.api.PulsarClientException$NotFoundException: No latest
schema found for topic my4-function-output-topic```
----
2020-01-09 16:31:19 UTC - Sijie Guo: JDBC connector requires schema
unfortunately
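For illustration only (hypothetical names, and assuming the default schema
auto-update settings): connecting any schema-aware producer to the sink's input
topic registers the Avro schema on that topic, which is what makes the "No
latest schema found for topic" error go away. Note this does not fix the missing
per-message schema version from clients without schema support. E.g. from Python:
```# Sketch only: a producer created with an Avro schema uploads that schema to
# the topic, so the JDBC sink can find a "latest schema" for it.
# Topic and record names here are hypothetical.
import pulsar
from pulsar.schema import Record, String, AvroSchema

class EntityRecord(Record):      # hypothetical stand-in for the real POJO
    ENTITYID = String()
    ENTITYNAME = String()

client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer('example-output-topic',
                                  schema=AvroSchema(EntityRecord))
producer.send(EntityRecord(ENTITYID='1', ENTITYNAME='example'))
client.close()```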
----
2020-01-09 16:39:50 UTC - vikash: I want to send the message from .NET only...
is there any way to sink into JDBC if some changes are made on the connector
side? I mean a custom connector (modifying something like the Sink interface
implementation in the abstract class, or the write() method logic...). I don't
know
2020-01-09 21:39:46 UTC - Sergii Zhevzhyk: Hello everyone! I am developing a
source connector for Pulsar which publishes AVRO messages. Unfortunately, I
experience some kind of inconsistent behavior depending on if I run the
connector using the `localrun` or `create` commands. When the `localrun`
command is used, the correct original AVRO schema is registered in the schema
registry. When I try to run the same connector with the same configuration but
using the `create` command, the registered schema is different and does not
correspond to the original AVRO schema.
I have created a sample project with code and readme on how to reproduce the
issue <https://github.com/vzhikserg/pulsar-connector-localrun-vs-create>.
I am wondering if this problem is related to this issue :thinking_face:
<https://github.com/apache/pulsar/issues/3762>
Any recommendations (or workarounds) on how to solve this issue would be
greatly appreciated. Thank you in advance!
----
2020-01-09 21:42:24 UTC - Sijie Guo: hmm it looks to be related.
----
2020-01-09 21:42:46 UTC - Sijie Guo: Are you able to collect the schemas
created by both approaches?
----
2020-01-09 21:46:58 UTC - Sergii Zhevzhyk: I have added the schemas (output
from `pulsar-admin schemas get`) to readme in the sample project. Is this what
you're asking?
----
2020-01-09 22:23:42 UTC - Isaiah Rairdon: Hey, hopefully this is an easy
Q&A. We are running Pulsar in Kubernetes and came across a question as to
which folders need to be persisted to external disk for BookKeeper. Currently
we have data/bookkeeper/ledgers and data/bookkeeper/journals mounted to external
disks, but we found another folder for ranges inside that bookkeeper folder. Does
that also need to be mounted to external disk, or can that folder be re-created
like other stateless folders?
----
2020-01-09 22:28:18 UTC - Sijie Guo: I think that is from state storage. That
can be recreated.
----
2020-01-09 22:45:21 UTC - Isaiah Rairdon: thank you
----
2020-01-10 02:05:29 UTC - juraj: In values.yaml, under the `configData:` of the
given component, add `PULSAR_LOG_LEVEL: "debug"`
----
2020-01-10 04:47:47 UTC - Nisha Singla: @Nisha Singla has joined the channel
----
2020-01-10 06:25:53 UTC - Leo Baker-Hytch: @Leo Baker-Hytch has joined the
channel
----
2020-01-10 06:49:05 UTC - Julius.b: :+1: Thanks for the help
----
2020-01-10 08:18:04 UTC - curdin: @curdin has joined the channel
----
2020-01-10 08:28:10 UTC - Fabien LD: the GitHub repo contains some YAML files
for deployment on Kubernetes that we are using as a reference ->
<https://github.com/apache/pulsar/tree/master/deployment/kubernetes>
----
2020-01-10 08:43:03 UTC - curdin: hey there, if you need to compare an event
sequence (`Event_X` with `id="abc"` happened right after `Event_X` with
`id="xyz"`), there is no way to query the `Event_X` topic, right? Only with
Pulsar SQL, which is not available from a Java client from what I saw