Re: [libvirt] libvirt will wait 20 minutes or hang when the network interface down

2012-12-24 Thread Benjamin Wang (gendwang)
Hi Michal,
  In most cases the thread waits about 20 minutes. I think 20 minutes is not
acceptable when a router that connects a lot of devices is down. More
importantly, some threads could hang. I am not sure whether this is a curl
bug, but the curl project suggests using these two options to address the
problem.

B.R.
Benjamin Wang

-----Original Message-----
From: Michal Privoznik [mailto:mpriv...@redhat.com]
Sent: December 24, 2012 16:57
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com; James Ye (jiaye); Yang Zhou (yangzho)
Subject: Re: [libvirt] libvirt will wait 20 minutes or hang when the network 
interface down

On 22.12.2012 09:59, Benjamin Wang (gendwang) wrote:
> Hi,
>
>   I find that when the network interface is down, in most scenarios
> libvirt will wait 20 minutes and then report the exception. In rare
> scenarios, the polling thread will hang even if the network is recovered.
>
> The following is the formal description from the libcurl website:
> http://curl.haxx.se/docs/faq.html  (Section "4.19 Why doesn't cURL
> return an error when the network cable is unplugged?")
>
> The following is a similar case about a thread hang:
> http://curl.haxx.se/mail/lib-2010-07/0108.html
>
> As for the 20-minute wait, although this is the normal TCP mechanism,
> consider a server that manages tens of thousands of devices through
> libvirt: when 1000 devices are down, this could cause threads to leak
> for a long period.
>
> As for the thread hang, it could cause threads to leak forever.
>
> I tried adding the following code in esx_vi.c. It seems that this
> code can avoid the above issues. Would you give your comments?
>
>     if (curl->headers == NULL) {
>         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>                        _("Could not build CURL header list"));
>         return -1;
>     }
>
> +    curl_easy_setopt(curl->handle, CURLOPT_LOW_SPEED_LIMIT, 10);
> +    curl_easy_setopt(curl->handle, CURLOPT_LOW_SPEED_TIME, 120);
>
>     curl_easy_setopt(curl->handle, CURLOPT_USERAGENT, "libvirt-esx");
>     curl_easy_setopt(curl->handle, CURLOPT_HEADER, 0);
>     curl_easy_setopt(curl->handle, CURLOPT_FOLLOWLOCATION, 0);
>     curl_easy_setopt(curl->handle, CURLOPT_SSL_VERIFYPEER,
>
> B.R.
> Benjamin Wang

I wonder whether this is actually a curl bug, since curl must know that the
interface is down. That is, I think curl_easy_perform(), which is wrapped in
esxVI_CURL_Perform(), should have returned an error.

Michal


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] libvirt will wait 20 minutes or hang when the network interface down

2012-12-22 Thread Benjamin Wang (gendwang)
Hi,
  I find that when the network interface is down, libvirt in most scenarios
waits 20 minutes and then reports the exception. In rare scenarios, the
polling thread hangs even after the network has recovered.
The following is the formal description from the libcurl website:
http://curl.haxx.se/docs/faq.html  (Section "4.19 Why doesn't cURL return an
error when the network cable is unplugged?")
The following is a similar case about a thread hang:
http://curl.haxx.se/mail/lib-2010-07/0108.html

As for the 20-minute wait, although this is the normal TCP mechanism, consider
a server that manages tens of thousands of devices through libvirt: when 1000
devices are down, this could cause threads to leak for a long period.
As for the thread hang, it could cause threads to leak forever.

I tried adding the following code in esx_vi.c. It seems that this code avoids
the above issues. Could you give me your comments?

    if (curl->headers == NULL) {
        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                       _("Could not build CURL header list"));
        return -1;
    }

+    curl_easy_setopt(curl->handle, CURLOPT_LOW_SPEED_LIMIT, 10);
+    curl_easy_setopt(curl->handle, CURLOPT_LOW_SPEED_TIME, 120);

    curl_easy_setopt(curl->handle, CURLOPT_USERAGENT, "libvirt-esx");
    curl_easy_setopt(curl->handle, CURLOPT_HEADER, 0);
    curl_easy_setopt(curl->handle, CURLOPT_FOLLOWLOCATION, 0);
    curl_easy_setopt(curl->handle, CURLOPT_SSL_VERIFYPEER,
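
As a rough illustration of what the two added options ask libcurl to do, the following is a simplified model of the low-speed rule: the transfer is treated as too slow once its rate has stayed below the limit (in bytes per second) for the configured number of seconds. This is not libcurl code. The type low_speed_monitor and the functions low_speed_init and low_speed_update are hypothetical names introduced for this sketch, and sampling the rate once per second is an assumption.

```c
#include <stdbool.h>

/* Hypothetical model of the CURLOPT_LOW_SPEED_LIMIT / CURLOPT_LOW_SPEED_TIME
 * rule. This is NOT libcurl's implementation. */
typedef struct {
    long limit_bytes_per_sec; /* plays the role of CURLOPT_LOW_SPEED_LIMIT */
    long time_sec;            /* plays the role of CURLOPT_LOW_SPEED_TIME  */
    long slow_seconds;        /* consecutive seconds spent below the limit */
} low_speed_monitor;

void low_speed_init(low_speed_monitor *m, long limit, long time_sec)
{
    m->limit_bytes_per_sec = limit;
    m->time_sec = time_sec;
    m->slow_seconds = 0;
}

/* Feed the number of bytes moved during the last second.
 * Returns true when the transfer should be aborted as too slow. */
bool low_speed_update(low_speed_monitor *m, long bytes_last_second)
{
    if (bytes_last_second < m->limit_bytes_per_sec)
        m->slow_seconds++;
    else
        m->slow_seconds = 0; /* one fast second resets the window */

    return m->slow_seconds >= m->time_sec;
}
```

With the values from the patch (10 bytes per second, 120 seconds), a link that goes silent would be given up after about two minutes in this model, instead of waiting for the TCP timeout.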


B.R.
Benjamin Wang

[libvirt] Connection release is not correct in libvirt and libvirt java

2012-12-18 Thread Benjamin Wang (gendwang)
Hi,
  The following is the current code that releases a connection in libvirt:
int
virConnectClose(virConnectPtr conn)
{
    ...

    if (!VIR_IS_CONNECT(conn)) {
        virLibConnError(VIR_ERR_INVALID_CONN, __FUNCTION__);
        goto error;
    }
    ...
error:
    virDispatchError(NULL);
    return ret;
}

If the cable is unplugged and the application calls virConnectClose() to
release the connection, the code takes the error path and the connection
cannot be released. I have changed the following two parts to fix this issue.
Please give me your comments.

Changed Code 1:
int
virConnectClose(virConnectPtr conn)
{
    ...

+    if (NULL == conn) {
+        return 0;
+    }

    ...

-    if (!VIR_IS_CONNECT(conn)) {
-        virLibConnError(VIR_ERR_INVALID_CONN, __FUNCTION__);
-        goto error;
-    }
    ...

error:
    virDispatchError(NULL);
    return ret;
}

Changed Code 2:
int
virUnrefConnect(virConnectPtr conn) {
    ...
+    if (NULL == conn) {
+        return 0;
+    }

-    if ((!VIR_IS_CONNECT(conn))) {
-        virLibConnError(VIR_ERR_INVALID_ARG, _("no connection"));
-        return -1;
-    }
    ...
}
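
One way to read the proposal is that closing should be a harmless no-op for a NULL handle, while a non-NULL handle can still be validated before it is released. The sketch below illustrates that idea only. The demo_conn type, the DEMO_CONN_MAGIC marker and the demo_conn_new and demo_conn_close functions are hypothetical; they are not libvirt's virConnect structure or its VIR_IS_CONNECT macro, and the magic-number check is merely an assumption about how validity might be tested.

```c
#include <stdlib.h>

#define DEMO_CONN_MAGIC 0x4F23DEADu /* hypothetical validity marker */

typedef struct {
    unsigned int magic; /* equals DEMO_CONN_MAGIC while the handle is valid */
    int refs;           /* reference count */
} demo_conn;

demo_conn *demo_conn_new(void)
{
    demo_conn *conn = calloc(1, sizeof(*conn));
    if (conn != NULL) {
        conn->magic = DEMO_CONN_MAGIC;
        conn->refs = 1;
    }
    return conn;
}

/* Returns the remaining reference count, 0 when nothing is left to release
 * (including the NULL case), or -1 for a handle that fails validation. */
int demo_conn_close(demo_conn *conn)
{
    if (conn == NULL)
        return 0; /* nothing to release */

    if (conn->magic != DEMO_CONN_MAGIC)
        return -1; /* corrupt handle: report it and do not free it */

    if (--conn->refs > 0)
        return conn->refs;

    conn->magic = 0; /* clear the marker before freeing */
    free(conn);
    return 0;
}
```

Unlike the patch above, this sketch keeps a validity check for non-NULL handles; whether that is appropriate for libvirt is left open here.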


libvirt java has a similar issue. I have changed the code in Connect.java as
follows. Please also give me your comments.
    public int close() throws LibvirtException {
        int success = 0;
        if (VCP != null) {
+            try {
                success = libvirt.virConnectClose(VCP);
                processError();
+            }
+            finally {
                // If leave an invalid pointer dangling around JVM crashes and burns
                // if someone tries to call a method on us
                // We rely on the underlying libvirt error handling to detect that
                // it's called with a null virConnectPointer
                VCP = null;
+            }
        }
        return success;
    }


B.R.
Benjamin Wang

Re: [libvirt] JNA Error Callback could cause core dump.

2012-10-19 Thread Benjamin Wang (gendwang)
Hi,
  I am using JNA 3.4.1. The problem is caused by libvirt java. You are right.

B.R.
Benjamin Wang

-----Original Message-----
From: Claudio Bley [mailto:cb...@av-test.de]
Sent: October 19, 2012 19:36
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com; Guannan Ren; Daniel Veillard; Yang Zhou (yangzho)
Subject: Re: JNA Error Callback could cause core dump.

>>>>> "BW" == Benjamin Wang (gendwang) <gendw...@cisco.com> writes:

    BW> Hi, When I changed code as following:

    BW> public class Connect {
    BW>     // Load the native part
    BW>     static {
    BW>         Libvirt.INSTANCE.virInitialize();
    BW>         try {
    BW>             ErrorHandler.processError(Libvirt.INSTANCE);
    BW>         } catch (Exception e) {
    BW>             e.printStackTrace();
    BW>         }
    BW>   +     Libvirt.INSTANCE.virSetErrorFunc(null, new ErrorCallback());
    BW>     }

    BW> The problem was caused that when JNA call setErrorFunc, it
    BW> will create ErrorCallback object. But when GC is executed, the
    BW> object is GCed.

Yes, that's why you should keep a reference to the object around.

    BW> But even I change code as following.

    BW> When GC is executed, the callback object will be moved. Then C
    BW> can't find this object. Both of scenarios will cause core
    BW> dump. It seems that JNA mustn't provide ErrorCallback Class,

First off, JNA does not provide this class, it is provided by the libvirt-java 
wrapper.

Which version of JNA did you use? As I said in a previous mail, I had crashes
with JNA < 3.4.2. Consequently, I cannot reproduce the crash using your code
with JNA 3.4.2 and this series
(https://www.redhat.com/archives/libvir-list/2012-October/msg00578.html)
applied (at least patch #15 is needed when using JNA 3.4.2).
--
AV-Test GmbH, Klewitzstr. 7, 39112 Magdeburg, Germany
Phone: +49 391 6075466, Fax: +49 391 6075469
Web: http://www.av-test.org

Eingetragen am / Registered at: Amtsgericht Stendal (HRB 114076) 
Geschaeftsfuehrer (CEO): Andreas Marx, Guido Habicht, Maik Morgenstern


[libvirt] JNA Error Callback could cause core dump.

2012-10-18 Thread Benjamin Wang (gendwang)
Hi,
  When I change the code as follows:
public class Connect {
    // Load the native part
    static {
        Libvirt.INSTANCE.virInitialize();
        try {
            ErrorHandler.processError(Libvirt.INSTANCE);
        } catch (Exception e) {
            e.printStackTrace();
        }

  +     Libvirt.INSTANCE.virSetErrorFunc(null, new ErrorCallback());
    }

The server will generate the following core dump:
Program terminated with signal 6, Aborted.
#0  0x003f9b030265 in raise () from /lib64/libc.so.6
(gdb) where
#0  0x003f9b030265 in raise () from /lib64/libc.so.6
#1  0x003f9b031d10 in abort () from /lib64/libc.so.6
#2  0x003f9b06a84b in __libc_message () from /lib64/libc.so.6
#3  0x003f9b07230f in _int_free () from /lib64/libc.so.6
#4  0x003f9b07276b in free () from /lib64/libc.so.6
#5  0x2cf46868 in ?? ()
#6  0x in ?? ()


The problem is that when virSetErrorFunc is called through JNA, an
ErrorCallback object is created, but when GC runs, the object is collected.
Even if I change the code as shown below, when GC runs the callback object
will be moved, and then C can't find this object. Both scenarios cause a core
dump. It seems that JNA shouldn't provide the ErrorCallback class, because
nobody can use it.
Please correct me if I am wrong.

public class Connect {
  + private static final ErrorCallback callback = new ErrorCallback();

    // Load the native part
    static {
        Libvirt.INSTANCE.virInitialize();
        try {
            ErrorHandler.processError(Libvirt.INSTANCE);
        } catch (Exception e) {
            e.printStackTrace();
        }

  +     Libvirt.INSTANCE.virSetErrorFunc(null, callback);
    }



B.R.
Benjamin Wang

Re: [libvirt] Memory free in libvirt JNA

2012-10-11 Thread Benjamin Wang (gendwang)
Hi Claudio,
   Sorry for my late response.
   I have gone through Claudio's solution. It's good, but I think it is not a
general solution, for two reasons:
1. The solution must use PointerByReference to encapsulate the Pointer. This
is not clean.
2. Libvirt provides a virFree method, but libraries in general may not provide
memory management functions.

My proposal is as follows:
1. Add a new interface in Libc.java:
public interface Libc extends Library {
    Libc INSTANCE = (Libc) Native.loadLibrary("c", Libc.class);

    public void free(Pointer p);
}

2. Change the code as follows:
     public SchedParameter[] getSchedulerParameters() throws LibvirtException {
         IntByReference nParams = new IntByReference();
         SchedParameter[] returnValue = new SchedParameter[0];
-        String scheduler = libvirt.virDomainGetSchedulerType(VDP, nParams);
+        Pointer pScheduler = libvirt.virDomainGetSchedulerType(VDP, nParams);
         processError();
-        if (scheduler != null) {
+        if (pScheduler != null) {
+            String scheduler = pScheduler.getString(0);
+            libc.free(pScheduler);
             virSchedParameter[] nativeParams = new virSchedParameter[nParams.getValue()];
             returnValue = new SchedParameter[nParams.getValue()];
             libvirt.virDomainGetSchedulerParameters(VDP, nativeParams, nParams);


What is your opinion?

B.R.
Benjamin Wang




-----Original Message-----
From: Claudio Bley [mailto:cb...@av-test.de]
Sent: October 8, 2012 20:33
To: veill...@redhat.com
Cc: Benjamin Wang (gendwang); libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Memory free in libvirt JNA

Hi Daniel,

At Fri, 28 Sep 2012 22:34:13 +0800,
Daniel Veillard wrote:
>
> sorry for the delay, I need to focus on something else ATM!

Me too. So, no worries! ;)

> First do you have a small pointer indicating where in JNA that kind of
> native deallocation must take place, since most of the time JNA can do
> the marshalling all by itself?

This affects mostly Strings. JNA makes the safe assumption that functions are
returning const char*s because it can't distinguish a string (const char*)
from a string (char*). See
https://github.com/twall/jna/blob/master/www/FrequentlyAskedQuestions.md#how-do-i-read-back-a-functions-string-result

So, here is a list of methods of org.libvirt.jna.Libvirt which return a string 
(/probably/ a char*, not const char*) which need to be
checked:

virConnectBaselineCPU
virConnectDomainXMLFromNative
virConnectDomainXMLToNative
virConnectFindStoragePoolSources
virConnectGetCapabilities
virConnectGetHostname
virConnectGetType
virConnectGetURI
virDomainGetName
virDomainGetOSType
virDomainGetXMLDesc
virDomainSnapshotGetXMLDesc
virInterfaceGetMACString
virInterfaceGetName
virInterfaceGetXMLDesc
virNWFilterGetName
virNWFilterGetXMLDesc
virNetworkGetBridgeName
virNetworkGetName
virNetworkGetXMLDesc
virNodeDeviceGetName
virNodeDeviceGetParent
virNodeDeviceGetXMLDesc
virSecretGetUsageID
virSecretGetXMLDesc
virStoragePoolGetName
virStoragePoolGetXMLDesc
virStorageVolGetKey
virStorageVolGetName
virStorageVolGetPath
virStorageVolGetXMLDesc

> And second would you have an idea how to systematically detect such
> leaks, the kind of loop suggested to expose the issue is not really
> practical to chase the leaks ...

I tried valgrind, but it didn't produce any output. mtrace wasn't very helpful 
either.

So, I just hacked this up:

,----[ memcheck.py ]
| import gdb
|
| allocations = {}
|
| class AllocBreak(gdb.FinishBreakpoint):
|     def stop(self):
|         global allocations
|
|         if self.return_value is not None:
|             callstack = []
|             frame = gdb.selected_frame()
|
|             while frame:
|                 name = frame.name()
|                 func = frame.function()
|                 sal = frame.find_sal()
|
|                 funcname = func.print_name if func else '?'
|                 line = sal.line
|                 filename = sal.symtab.filename if sal.symtab else '?'
|
|                 callstack.append((name, filename, line, funcname))
|
|                 frame = frame.older()
|
|             addr = int(str(self.return_value), 16)
|             allocations[addr] = callstack
|
|
| class MemAlloc (gdb.Command):
|     """Track allocations."""
|     def __init__(self):
|         super(MemAlloc, self).__init__("memalloc", gdb.COMMAND_NONE)
|
|     def invoke(self, arg, from_tty):
|         top = gdb.selected_frame()
|         frame = top.older()
|
|         if frame:
|             func = frame.function()
|
|             if func: #  and func.name.startswith("virAlloc"):
|                 ab = AllocBreak(top, True)
|                 ab.silent = True
|
|
| class MemFree(gdb.Command):
|     """Track de-allocations."""
|     def __init__(self):
|         super(MemFree, self).__init__("memfree", gdb.COMMAND_NONE)
|
|     def invoke(self, arg, from_tty):
|         global allocations

Re: [libvirt] Memory free in libvirt JNA

2012-10-11 Thread Benjamin Wang (gendwang)
Hi Claudio,
   Thanks for informing me about the new JNA version. I tried the API provided
by JNA. It works well. The updated code is as follows:

     public SchedParameter[] getSchedulerParameters() throws LibvirtException {
         IntByReference nParams = new IntByReference();
         SchedParameter[] returnValue = new SchedParameter[0];
-        String scheduler = libvirt.virDomainGetSchedulerType(VDP, nParams);
+        Pointer pScheduler = libvirt.virDomainGetSchedulerType(VDP,
+                                                               nParams);
         processError();
-        if (scheduler != null) {
+        if (pScheduler != null) {
+            String scheduler = pScheduler.getString(0);
+            Native.free(Pointer.nativeValue(pScheduler));
             virSchedParameter[] nativeParams = new virSchedParameter[nParams.getValue()];
             returnValue = new SchedParameter[nParams.getValue()];
             libvirt.virDomainGetSchedulerParameters(VDP, nativeParams, nParams);

If there is no issue, I recommend using this solution to fix all the JNA code.

BTW:
Not every returned String should be freed through JNA. For example, in the
Domain class, the results returned by getName/getUUIDString must not be freed,
because the memory they reference was not allocated just for the caller. We
must analyze this case by case.
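
The distinction can be illustrated in plain C. In the sketch below, demo_get_scheduler_type returns a freshly allocated copy that the caller must free, while demo_get_name returns a pointer into the object that the caller must not free. Both functions, the demo_domain type and the placeholder string "credit" are hypothetical stand-ins, not libvirt APIs; which real libvirt functions fall into which group has to be checked against their documentation.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char name[32]; /* storage owned by the object itself */
} demo_domain;

/* Caller-owned result: a new heap copy is returned, so the caller frees it. */
char *demo_get_scheduler_type(const demo_domain *dom)
{
    const char *type = "credit"; /* placeholder value for the sketch */
    char *copy;

    (void)dom; /* unused in this sketch */
    copy = malloc(strlen(type) + 1);
    if (copy != NULL)
        strcpy(copy, type);
    return copy;
}

/* Borrowed result: points into *dom and stays valid only while dom lives.
 * Passing it to free() would corrupt the heap. */
const char *demo_get_name(const demo_domain *dom)
{
    return dom->name;
}
```

A binding that frees every returned string would be correct for the first function and would corrupt memory for the second, which is why each function needs to be reviewed individually.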

B.R.
Benjamin Wang


-----Original Message-----
From: Claudio Bley [mailto:cb...@av-test.de]
Sent: October 11, 2012 23:36
To: Benjamin Wang (gendwang)
Cc: veill...@redhat.com; libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Memory free in libvirt JNA

At Thu, 11 Oct 2012 08:37:23 +,
Benjamin Wang (gendwang) wrote:
>
> Hi Claudio,
>    Sorry for my late response.
>    I have gone through Claudio's solution. It's good. But I think this is not
> a common solution. There are two points:
> 1. This solution must use PointerByReference to encapsulate the
>    Pointer. This is not clean.

Yes, as I said, this adds another level of indirection --- which is pretty
useless in Java.

> 2. Libvirt provides virFree method. But a common library could not
>    provide memory management functions.

Sorry, I don't get your point here.

> My proposal is as following:
> 1. Add a new Class Libc.java
> public interface Libc extends Library {
>     Libc INSTANCE = (Libc) Native.loadLibrary("c", Libc.class);
>
>     public void free(Pointer p);
> }

Not every platform has a shared library called "c". On Windows this would be
msvcrt.dll for the Microsoft runtime. So, you would need to branch on the
platform to load whatever library seems appropriate.

Also, I just discovered that since version 3.3.0 JNA provides a public free 
method itself.

Since I get crashes when using callback functions with JNA 3.2.7 in certain 
circumstances it is better just to require a newer version of JNA, IMHO.

I'll post a few patches with improvements and memory fixes tomorrow.

--
AV-Test GmbH, Henricistraße 20, 04155 Leipzig, Germany
Phone: +49 341 265 310 19
Web:http://www.av-test.org

Eingetragen am / Registered at: Amtsgericht Stendal (HRB 114076) 
Geschaeftsfuehrer (CEO): Andreas Marx, Guido Habicht, Maik Morgenstern


Re: [libvirt] Core dump caused by misusing openssl in multithread scenario!

2012-10-08 Thread Benjamin Wang (gendwang)
-----Original Message-----
From: Matthias Bolte [mailto:matthias.bo...@googlemail.com]
Sent: October 7, 2012 2:14
To: Benjamin Wang (gendwang)
Cc: Daniel P. Berrange; libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Core dump caused by misusing openssl in multithread 
scenario!

2012/10/2 Benjamin Wang (gendwang) <gendw...@cisco.com>:
> Hi Daniel,
>    Is this problem fixed in the latest version? What about the question 2
> which related to openssl callbacks in multi-thread?

As Daniel said, we cannot assume that libcurl was built with the OpenSSL
backend. We would need some way to detect this first.
[Benjamin]: I agree. But if libcurl accesses ESXi over https, OpenSSL will be
used, and libvirt must call
CRYPTO_set_id_callback/CRYPTO_set_locking_callback to support multiple
threads.

Also, wasn't there a license problem with OpenSSL and the (L)GPL? Can libvirt
legally be used with a libcurl that is linked with OpenSSL?
[Benjamin]: I think there is no open source license issue. We will not change
the libcurl or openssl source code. All we need is to call the openssl API
(CRYPTO_set_id_callback/CRYPTO_set_locking_callback) to support multiple
threads.

--
Matthias Bolte
http://photron.blogspot.com


Re: [libvirt] Core dump caused by misusing openssl in multithread scenario!

2012-10-02 Thread Benjamin Wang (gendwang)
Hi Daniel,
   Is this problem fixed in the latest version? What about question 2, which
relates to the openssl callbacks in multi-threaded use?

B.R.
Benjamin Wang

-----Original Message-----
From: Daniel P. Berrange [mailto:berra...@redhat.com]
Sent: October 2, 2012 16:02
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Core dump caused by misusing openssl in multithread 
scenario!

On Tue, Oct 02, 2012 at 02:57:46AM +, Benjamin Wang (gendwang) wrote:
> Hi Daniel,
>    My comments are as following:
> 1. Currently curl_easy_init method is called from esxVI_CURL_Connect
> method in esx_vi.c. And curl_global_init method is called by curl_easy_init.
> If we move curl_global_init to virInitialize, shall we still need to call
> curl_easy_init from esxVI_CURL_Connect? Did the latest version fix this
> problem?

That is actually the problem. The CURL docs explicitly tell you that it is 
*not* safe to rely on curl_easy_init in a multithreaded program. You must call 
curl_global_init explicitly.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


Re: [libvirt] Two core dumps are generated in multi-thread scenarios

2012-10-01 Thread Benjamin Wang (gendwang)
Hi Matthias,
   This can't be reproduced 100% of the time. I reproduced this case twice. But
after I set CURLOPT_NOSIGNAL to 1, I did not see a similar core dump again,
and everything seems to work well. What do you mean by "stuck in a DNS
lookup"?

B.R.
Benjamin Wang

-----Original Message-----
From: Matthias Bolte [mailto:matthias.bo...@googlemail.com]
Sent: September 30, 2012 4:20
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: Two core dumps are generated in multi-thread scenarios

2012/9/23 Benjamin Wang (gendwang) <gendw...@cisco.com>:
> Hi,
>   I found two core dumps generated in multi-thread scenarios in ESX part.
>
> Case1: libcurl support multi-thread
> core dump:
> #12 0x2aaabea89712 in addbyter () from /usr/local/lib/libcurl.so.4
> #13 0x2aaabea89b86 in dprintf_formatf () from /usr/local/lib/libcurl.so.4
> #14 0x2aaabea8b055 in curl_mvsnprintf () from /usr/local/lib/libcurl.so.4
> #15 0x2aaabea7678f in Curl_failf () from /usr/local/lib/libcurl.so.4
> #16 0x2aaabea6d871 in Curl_resolv_timeout () from /usr/local/lib/libcurl.so.4
> #17 0x0006e8a8f230 in ?? ()
>
> Fix code:
> esxVI_CURL_Connect() in esx_vi.c:
> I add a new line as following:
> curl_easy_setopt(curl->handle, CURLOPT_NOSIGNAL, 1);

It took me a moment reading libcurl code until I figured out what might be 
happening here. The problem is that Curl_resolv_timeout uses SIGALRM + 
sigsetjmp/siglongjmp to realize the timeout logic. This implementation is not 
thread-safe as the SIGALRM might be executed on a different thread than the 
original thread that started the call to Curl_resolv_timeout. This in turn 
results in the call to Curl_resolv_timeout being continued via siglongjmp 
(called from the SIGALRM handler) on different thread. Setting CURLOPT_NOSIGNAL 
to 1 makes libcurl avoid the SIGALRM + sigsetjmp/siglongjmp implementation.
This solves the problem but with the cost of losing the timeout capability.

In your case a DNS lookup took longer than libcurl was willing to wait and a 
timeout aborted it. But the call to Curl_failf (as part of the timeout error 
handling) was made on the wrong thread (I think) making it segfault. IMHO there 
is no ideal solution here, because with CURLOPT_NOSIGNAL set to 0 (the default) 
libcurl can realize DNS lookup with timeout, but the error handling might occur 
on the wrong thread.
But with CURLOPT_NOSIGNAL set to 1 the segfault is avoided but libcurl might 
get stuck in a DNS lookup.
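
The timeout pattern described above can be sketched in a few lines of single-threaded C. This illustrates the general SIGALRM plus sigsetjmp/siglongjmp technique and is not libcurl's code. The name blocking_call_with_timeout is hypothetical, and a nanosleep() call stands in for the DNS lookup.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <setjmp.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

static sigjmp_buf timeout_env; /* one process-wide jump buffer, not per thread */

static void on_alarm(int signo)
{
    (void)signo;
    siglongjmp(timeout_env, 1); /* resume at sigsetjmp() with value 1 */
}

/* Pretend to run a blocking operation that needs work_sec seconds, but give
 * up after timeout_sec seconds. Returns 0 on completion, -1 on timeout. */
int blocking_call_with_timeout(unsigned int timeout_sec, unsigned int work_sec)
{
    struct sigaction sa, old_sa;
    struct timespec ts;
    int result;

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, &old_sa);

    if (sigsetjmp(timeout_env, 1) != 0) {
        /* Reached via siglongjmp() from the handler. In a multi-threaded
         * process the signal may be delivered to a different thread, which
         * would then jump into a context that belongs to another thread. */
        result = -1;
    } else {
        alarm(timeout_sec);
        ts.tv_sec = (time_t)work_sec;
        ts.tv_nsec = 0;
        /* Stand-in for the blocking lookup; restart if interrupted. */
        while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
            ;
        alarm(0); /* finished in time: cancel the pending alarm */
        result = 0;
    }

    sigaction(SIGALRM, &old_sa, NULL);
    return result;
}
```

The single static jump buffer and the process-wide signal handler are what make this approach unsuitable for multi-threaded programs, which matches the trade-off discussed above: without the signal there is no crash, but also no timeout.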

Are you able to reproduce this problem and can you confirm that setting 
CURLOPT_NOSIGNAL to 1 fixes it?

--
Matthias Bolte
http://photron.blogspot.com


Re: [libvirt] Core dump caused by misusing openssl in multithread scenario!

2012-10-01 Thread Benjamin Wang (gendwang)
Hi Daniel,
   My comments are as follows:
1. Currently curl_easy_init is called from esxVI_CURL_Connect in esx_vi.c, and
curl_global_init is called by curl_easy_init. If we move curl_global_init to
virInitialize, do we still need to call curl_easy_init from
esxVI_CURL_Connect? Did the latest version fix this problem?
2. If we need to use openssl from multiple threads, we must register the two
callbacks. Currently libcurl doesn't do this. If we do not register these two
callbacks in libvirt, how should this be handled?

B.R.
Benjamin Wang

-----Original Message-----
From: Daniel P. Berrange [mailto:berra...@redhat.com]
Sent: October 1, 2012 16:24
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Core dump caused by misusing openssl in multithread 
scenario!

On Sat, Sep 29, 2012 at 01:31:07PM +, Benjamin Wang (gendwang) wrote:
> Hi,
>   I am running libvirt with ESXi driver in multithread scenario to access
> ESXi by https. Sometimes a core dump will be generated as following:
> #0  0x003f9b030265 in raise () from /lib64/libc.so.6
> #1  0x003f9b031d10 in abort () from /lib64/libc.so.6
> #2  0x003f9b06a84b in __libc_message () from /lib64/libc.so.6
> #3  0x003f9b072fae in _int_malloc () from /lib64/libc.so.6
> #4  0x003f9b074cde in malloc () from /lib64/libc.so.6
> #5  0x003f9b07963b in strerror () from /lib64/libc.so.6
> #6  0x003fa188032a in ERR_load_ERR_strings () from
> /lib64/libcrypto.so.6
> #7  0x003fa187fde9 in ERR_load_crypto_strings () from
> /lib64/libcrypto.so.6
> #8  0x003fa48309d9 in SSL_load_error_strings () from
> /lib64/libssl.so.6
> #9  0x2aaaba8e612e in Curl_ossl_init () from
> /opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
> #10 0x2aaaba8ee6c1 in curl_global_init () from
> /opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
> #11 0x2aaaba8ee6f8 in curl_easy_init () from
> /opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
> #12 0x2aaaba0d932b in esxVI_SessionIsActive (ctx=0x2aaac093ca80,
> sessionID=0x2aaac06932a0 `3i\300\252*, userName=0x2aaac0ae6e80
> root, output=0x) at
> esx/esx_vi_methods.generated.c:599
> #13 0x2aaaba0c7a60 in esxStorageVolumeLookupByKey (conn=0x7412,
> key=0x76c1 Address 0x76c1 out of bounds) at
> esx/esx_storage_driver.c:825
>
> I checked that currently ESXi driver didn't initialize openssl.
> Because libcurl will not handle openssl for multi-thread. According to
> openssl API, libvirt should

No code in libvirt should assume curl uses openssl - it may well have been 
compiled with gnutls, or nss instead. The actual flaw here is that libvirt does 
not invoke 'curl_global_init' from virInitialize.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


[libvirt] Potential race condition problem

2012-09-29 Thread Benjamin Wang (gendwang)
Hi,
   Currently the virInitialize() function defined in libvirt.c has the
following code:
int
virInitialize(void)
{
    if (initialized)
        return 0;

    initialized = 1;

    if (virThreadInitialize() < 0 ||
        virErrorInitialize() < 0 ||
        virRandomInitialize(time(NULL) ^ getpid()) ||
        virNodeSuspendInit() < 0)
        return -1;

    ...
}

When two threads call virInitialize(), there is no lock protecting the
initialized variable. If the first thread enters the function and sets
initialized to 1, the second thread could see that initialized is 1 (I say
"could" because initialized is not declared volatile). In some situations,
before the first thread finishes all the initialization, the second thread
could then use resources that should have been initialized in virInitialize().
If you have any comments, please let me know. Thanks!
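
One common way to close this kind of race in POSIX code is pthread_once(). The following is a minimal sketch of that technique, not libvirt's actual fix. The names demo_initialize, do_global_init and demo_race are hypothetical, and the setup body is only a placeholder for the real initialization steps.

```c
#include <pthread.h>

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_result = -1; /* result of the one-time setup */
static int init_runs = 0;    /* how many times the setup body really ran */

static void do_global_init(void)
{
    /* Placeholder for the real setup steps. */
    init_runs++;
    init_result = 0;
}

/* Safe to call from any number of threads: pthread_once() runs
 * do_global_init() exactly once, and every other caller returns only
 * after that run has completed. */
int demo_initialize(void)
{
    if (pthread_once(&init_once, do_global_init) != 0)
        return -1;
    return init_result;
}

static void *init_thread(void *arg)
{
    (void)arg;
    demo_initialize();
    return NULL;
}

/* Usage: let eight threads race to initialize, then report how often the
 * setup body actually ran. */
int demo_race(void)
{
    pthread_t threads[8];
    int i, started = 0;

    for (i = 0; i < 8; i++) {
        if (pthread_create(&threads[started], NULL, init_thread, NULL) == 0)
            started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return init_runs;
}
```

The key difference from a plain flag is that late callers block until the first caller has finished, so nobody can observe a half-initialized library.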

B.R.
Benjamin Wang

Re: [libvirt] Potential race condition problem

2012-09-29 Thread Benjamin Wang (gendwang)
Hi,
OK. I am using JNA to access libvirt. If we add another mutex to guard access
to the "initialized" variable, that mutex itself must first be initialized
with pthread_mutex_init, exactly once.
But it seems there is no way to do that by changing the libvirt code. I do it
as follows:

1.  Changing the libvirt JNA code in Connect.java
Old Code:
    public Connect(String uri) throws LibvirtException {
        VCP = libvirt.virConnectOpen(uri);
        // Check for an error
        processError();
        ErrorHandler.processError(Libvirt.INSTANCE);
    }

New Code:
    public Connect(String uri) throws LibvirtException {
        synchronized(this.getClass()) {
            VCP = libvirt.virConnectOpen(uri);
        }
        // Check for an error
        processError();
        ErrorHandler.processError(Libvirt.INSTANCE);
    }

This makes sure that only one thread at a time can execute the connect call.
A server application only needs to connect once, so the performance is OK.


2.  Changing the libvirt code in libvirt.c
Old Code:
static int initialized = 0;

New Code:
static int volatile initialized = 0;

This should make sure the initialization is executed only once.

Would you give me your comments on this solution?

B.R.
Benjamin Wang

From: Guannan Ren [mailto:g...@redhat.com]
Sent: September 29, 2012 15:43
To: Benjamin Wang (gendwang)
Cc: Daniel Veillard; libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Potential race condition problem

On 09/29/2012 03:07 PM, Benjamin Wang (gendwang) wrote:
> Hi,
>    Currently virInitialize() method defined in libvirt.c has the following code:
> int
> virInitialize(void)
> {
>     if (initialized)
>         return 0;
>
>     initialized = 1;
>
>     if (virThreadInitialize() < 0 ||
>         virErrorInitialize() < 0 ||
>         virRandomInitialize(time(NULL) ^ getpid()) ||
>         virNodeSuspendInit() < 0)
>         return -1;
>
>     ……
> }
>
> When two threads access virInitialize method, there is no lock for the
> “initialized” parameter. If the first thread enters this method and set
> “initialized” to 1,
> the second thread could see that “initialized” is 1(Because initialized is not
> volatiled, I say could). In some situation, before the first thread finishes
> all the initialization,
> the second thread could use some resources which should be initialized in
> Initialize method.
> If you have any comments, please let me know. Thanks!
>
> B.R.
> Benjamin Wang


  As the comment above the function says:
  "It's better to call this routine at startup in multithreaded
applications to avoid potential race when initializing the library."


  Guannan


Re: [libvirt] Potential race condition problem

2012-09-29 Thread Benjamin Wang (gendwang)
Hi,
I think you misunderstood me. My solution consists of step 1 plus step 2.
Step 1 implements mutual exclusion between threads. Step 2 handles the
visibility of "initialized". Without step 2, the initialization could be
executed several times.

B.R.
Benjamin Wang

From: Guannan Ren [mailto:g...@redhat.com]
Sent: September 29, 2012 17:22
To: Benjamin Wang (gendwang)
Cc: Daniel Veillard; libvir-list@redhat.com; Yang Zhou (yangzho); 
cb...@av-test.de
Subject: Re: [libvirt] Potential race condition problem

On 09/29/2012 03:52 PM, Benjamin Wang (gendwang) wrote:
> Hi,
> OK. Now I am using JNA to access libvirt. If we add another mutex which used to
> access “initialized” parameter. This mutex must be pthread_mutex_init firstly
> and only once.
> But it seems that there is no way to change libvirt code. I do it as following:
>
> 1.  Changing libvirt JNA code in Connect.java
> Old Code:
>     public Connect(String uri) throws LibvirtException {
>         VCP = libvirt.virConnectOpen(uri);
>         // Check for an error
>         processError();
>         ErrorHandler.processError(Libvirt.INSTANCE);
>     }
>
> New Code:
>     public Connect(String uri) throws LibvirtException {
>         synchronized(this.getClass()) {
>             VCP = libvirt.virConnectOpen(uri);
>         }
>         // Check for an error
>         processError();
>         ErrorHandler.processError(Libvirt.INSTANCE);
>     }
>
> This can make sure only that one thread can execute Connect. For a server
> application, we only need one time. So the performance is OK
>
>
> 2.  Changing libvirt code in libvirt.c
> Old Code:
> static int initialized = 0;
>
> New Code:
> static int volatile initialized = 0;
>
> This can make sure the initialization will be executed once.
>
> Would you give your comments for this solution?
>
> B.R.
> Benjamin Wang


  As far as I know, operations on a volatile variable are not atomic, and
using the volatile keyword as a portable synchronization mechanism is
discouraged in C.
  In Java, however, volatile establishes a global ordering on the reads and
writes to the variable.
  So maybe your first solution is good enough.

  Guannan
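
On the C side of the one-time initialization question discussed above, POSIX provides pthread_once, which does not need a separately initialized mutex. What follows is only a minimal sketch under stated assumptions, not libvirt's actual code: demo_do_init, demo_initialize and demo_race are hypothetical names, and the counter exists only to show that the body runs once however many threads race for it.

```c
#include <assert.h>
#include <pthread.h>

#define DEMO_MAX_THREADS 16

static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static int demo_init_runs = 0;   /* how many times the init body really ran */

/* Hypothetical stand-in for the real one-time setup work. */
static void demo_do_init(void)
{
    demo_init_runs++;
}

/* May be called from any thread, any number of times; returns 0 on success. */
int demo_initialize(void)
{
    return pthread_once(&demo_once, demo_do_init);
}

static void *demo_worker(void *arg)
{
    (void)arg;
    demo_initialize();
    return NULL;
}

/* Start n threads that all race to initialize, wait for them, and return
 * how many times the init body has run so far (-1 if a thread failed to
 * start). */
int demo_race(int n)
{
    pthread_t tid[DEMO_MAX_THREADS];
    int i, started = 0;

    assert(n >= 0 && n <= DEMO_MAX_THREADS);
    for (i = 0; i < n; i++) {
        if (pthread_create(&tid[i], NULL, demo_worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == n ? demo_init_runs : -1;
}
```

Calling demo_race(8) and then demo_race(4) should report a single run of the init body both times. The design point is that pthread_once handles both mutual exclusion and visibility, so no volatile flag is needed.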


[libvirt] Core dump caused by misusing openssl in multithread scenario!

2012-09-29 Thread Benjamin Wang (gendwang)
Hi,
  I am running libvirt with the ESX driver in a multi-threaded scenario, accessing 
ESXi over HTTPS. Sometimes a core dump is generated, with the following backtrace:
#0  0x003f9b030265 in raise () from /lib64/libc.so.6
#1  0x003f9b031d10 in abort () from /lib64/libc.so.6
#2  0x003f9b06a84b in __libc_message () from /lib64/libc.so.6
#3  0x003f9b072fae in _int_malloc () from /lib64/libc.so.6
#4  0x003f9b074cde in malloc () from /lib64/libc.so.6
#5  0x003f9b07963b in strerror () from /lib64/libc.so.6
#6  0x003fa188032a in ERR_load_ERR_strings () from /lib64/libcrypto.so.6
#7  0x003fa187fde9 in ERR_load_crypto_strings () from /lib64/libcrypto.so.6
#8  0x003fa48309d9 in SSL_load_error_strings () from /lib64/libssl.so.6
#9  0x2aaaba8e612e in Curl_ossl_init () from 
/opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
#10 0x2aaaba8ee6c1 in curl_global_init () from 
/opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
#11 0x2aaaba8ee6f8 in curl_easy_init () from 
/opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
#12 0x2aaaba0d932b in esxVI_SessionIsActive (ctx=0x2aaac093ca80, 
sessionID=0x2aaac06932a0 `3i\300\252*, userName=0x2aaac0ae6e80 root, 
output=0x) at esx/esx_vi_methods.generated.c:599
#13 0x2aaaba0c7a60 in esxStorageVolumeLookupByKey (conn=0x7412, key=0x76c1 
Address 0x76c1 out of bounds) at esx/esx_storage_driver.c:825

I checked and found that the ESX driver currently does not initialize OpenSSL, 
and libcurl does not set up OpenSSL for multi-threaded use. According to the 
OpenSSL API, libvirt should register two callbacks to support multiple threads. 
The detailed description is here:
http://www.openssl.org/docs/crypto/threads.html

I have changed code as following:

1.  virInitialize() in libvirt.c
Old Code:
int
virInitialize(void)
{
...
virLogSetFromEnv();
virNetTLSInit();
...
}

New Code:
int
virInitialize(void)
{
...
virLogSetFromEnv();
virNetTLSInit();
virOpenSSLInit();
...
}


2.  In virnetServer.c
New Code:
pthread_mutex_t *lock_cs;
long *lock_count;

void virOpenSSLLockCallback(int mode, int type, const char *file 
ATTRIBUTE_UNUSED, int line ATTRIBUTE_UNUSED) {
    if (mode & CRYPTO_LOCK)
    {
        pthread_mutex_lock(&(lock_cs[type]));
        lock_count[type]++;
    }
    else
    {
        pthread_mutex_unlock(&(lock_cs[type]));
    }
}

unsigned long virOpenSSLIdCallback(void)
{
    unsigned long ret;

    ret = (unsigned long)pthread_self();
    return ret;
}

void virOpenSSLInit(void)
{
    int i;

    lock_cs = OPENSSL_malloc(CRYPTO_num_locks() * sizeof(pthread_mutex_t));
    lock_count = OPENSSL_malloc(CRYPTO_num_locks() * sizeof(long));
    for (i = 0; i < CRYPTO_num_locks(); i++)
    {
        lock_count[i] = 0;
        pthread_mutex_init(&(lock_cs[i]), NULL);
    }

    CRYPTO_set_id_callback(virOpenSSLIdCallback);
    CRYPTO_set_locking_callback(virOpenSSLLockCallback);
}
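
The lock-table pattern used by the callbacks above can also be tried without linking OpenSSL. The following is a hedged sketch only: DEMO_LOCK is a stand-in for the CRYPTO_LOCK bit, all demo_* names are hypothetical, and unlike the real OpenSSL locking callback (which returns void) this version returns a status so it can be checked. It also adds a cleanup step, which the proposal above does not include.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_LOCK 1                  /* stand-in for OpenSSL's CRYPTO_LOCK bit */

static pthread_mutex_t *demo_locks;  /* one mutex per lock slot */
static int demo_nlocks;

/* Allocate and initialize n mutexes; returns 0 on success, -1 on failure. */
int demo_locks_init(int n)
{
    int i;

    assert(demo_locks == NULL);      /* no second init without cleanup */
    if (n <= 0)
        return -1;
    demo_locks = malloc((size_t)n * sizeof(*demo_locks));
    if (demo_locks == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        if (pthread_mutex_init(&demo_locks[i], NULL) != 0) {
            while (i-- > 0)          /* undo the mutexes already created */
                pthread_mutex_destroy(&demo_locks[i]);
            free(demo_locks);
            demo_locks = NULL;
            return -1;
        }
    }
    demo_nlocks = n;
    return 0;
}

/* Lock slot `type` when the DEMO_LOCK bit is set in `mode`, otherwise
 * unlock it. Returns 0 on success, -1 on a bad slot or a mutex error. */
int demo_lock_callback(int mode, int type)
{
    if (type < 0 || type >= demo_nlocks)
        return -1;
    if (mode & DEMO_LOCK)
        return pthread_mutex_lock(&demo_locks[type]) == 0 ? 0 : -1;
    return pthread_mutex_unlock(&demo_locks[type]) == 0 ? 0 : -1;
}

/* Destroy every mutex and release the table. */
void demo_locks_cleanup(void)
{
    int i;

    for (i = 0; i < demo_nlocks; i++)
        pthread_mutex_destroy(&demo_locks[i]);
    free(demo_locks);
    demo_locks = NULL;
    demo_nlocks = 0;
}
```

A caller would run demo_locks_init once at startup, let the library invoke the callback, and call demo_locks_cleanup at shutdown after all users of the locks have finished.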


To be honest, virOpenSSLInit/virOpenSSLIdCallback/virOpenSSLLockCallback 
should not be defined in this file, but it seems that the Makefile generated by 
autoconf cannot pick up a new file recursively.
What do you think of this solution? If you have any comments, please feel free 
to contact me.


BTW: if I add a new source/header file, is there a simple way to update the 
Makefile?

B.R.
Benjamin Wang




Re: [libvirt] Two core dumps are generated in multi-thread scenarios

2012-09-23 Thread Benjamin Wang (gendwang)
Hi,
  The old code (in esx_vi.c) is as below:
curl_easy_setopt(curl->handle, CURLOPT_USERAGENT, "libvirt-esx");
curl_easy_setopt(curl->handle, CURLOPT_HEADER, 0);

New code:
curl_easy_setopt(curl->handle, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(curl->handle, CURLOPT_USERAGENT, "libvirt-esx");
curl_easy_setopt(curl->handle, CURLOPT_HEADER, 0);

B.R.
Benjamin Wang

-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com] 
Sent: 2012年9月23日 16:52
To: Benjamin Wang (gendwang)
Cc: Matthias Bolte; libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Two core dumps are generated in multi-thread scenarios

On Sun, Sep 23, 2012 at 03:32:52AM +, Benjamin Wang (gendwang) wrote:
 Hi,
   I found two core dumps generated in multi-threaded scenarios in the ESX driver.
 Case 1: libcurl multi-thread support
 Core dump:
 #12 0x2aaabea89712 in addbyter () from /usr/local/lib/libcurl.so.4
 #13 0x2aaabea89b86 in dprintf_formatf () from 
 /usr/local/lib/libcurl.so.4
 #14 0x2aaabea8b055 in curl_mvsnprintf () from 
 /usr/local/lib/libcurl.so.4
 #15 0x2aaabea7678f in Curl_failf () from 
 /usr/local/lib/libcurl.so.4
 #16 0x2aaabea6d871 in Curl_resolv_timeout () from 
 /usr/local/lib/libcurl.so.4
 #17 0x0006e8a8f230 in ?? ()
 
 Fix:
 In esxVI_CURL_Connect() in esx_vi.c, I added the following line:
 curl_easy_setopt(curl->handle, CURLOPT_NOSIGNAL, 1);

 Where exactly in the function? Can you send a diff of your change?

Daniel



-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


[libvirt] Two core dumps are generated in multi-thread scenarios

2012-09-22 Thread Benjamin Wang (gendwang)
Hi,
  I found two core dumps generated in multi-threaded scenarios in the ESX driver.
Case 1: libcurl multi-thread support
Core dump:
#12 0x2aaabea89712 in addbyter () from /usr/local/lib/libcurl.so.4
#13 0x2aaabea89b86 in dprintf_formatf () from /usr/local/lib/libcurl.so.4
#14 0x2aaabea8b055 in curl_mvsnprintf () from /usr/local/lib/libcurl.so.4
#15 0x2aaabea7678f in Curl_failf () from /usr/local/lib/libcurl.so.4
#16 0x2aaabea6d871 in Curl_resolv_timeout () from 
/usr/local/lib/libcurl.so.4
#17 0x0006e8a8f230 in ?? ()

Fix:
In esxVI_CURL_Connect() in esx_vi.c, I added the following line:
curl_easy_setopt(curl->handle, CURLOPT_NOSIGNAL, 1);


Case 2: libssl multi-thread support
Core dump:
#0  0x003f9b030265 in raise () from /lib64/libc.so.6
#1  0x003f9b031d10 in abort () from /lib64/libc.so.6
#2  0x003f9b06a84b in __libc_message () from /lib64/libc.so.6
#3  0x003f9b072fae in _int_malloc () from /lib64/libc.so.6
#4  0x003f9b074cde in malloc () from /lib64/libc.so.6
#5  0x003f9b07963b in strerror () from /lib64/libc.so.6
#6  0x003fa188032a in ERR_load_ERR_strings () from /lib64/libcrypto.so.6
#7  0x003fa187fde9 in ERR_load_crypto_strings () from /lib64/libcrypto.so.6
#8  0x003fa48309d9 in SSL_load_error_strings () from /lib64/libssl.so.6
#9  0x2aaaba8e612e in Curl_ossl_init () from 
/opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
#10 0x2aaaba8ee6c1 in curl_global_init () from 
/opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
#11 0x2aaaba8ee6f8 in curl_easy_init () from 
/opt/CSCOppm-unit/hypervisor/libcurl/lib/libcurl.so.4
#12 0x2aaaba0d932b in esxVI_RegisterVM_Task (ctx=0x2aaaba0d96d1, 
_this=0x5cf54b20, path=0x50e921c0 10.74.125.50, name=0x2aaac0ae6e80 root, 
asTemplate=3228119712, pool=0x5cf54b20, host=0x2aaac0693270, output=0x50e921a0)
at esx/esx_vi_methods.generated.c:480

Possible problem:
Two callback functions (locking_function and threadid_func) need to be set.
http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION


Could you comment on these two core dumps?

B.R.
Benjamin Wang


Re: [libvirt] Libvir JNA report SIGSEGV

2012-09-10 Thread Benjamin Wang (gendwang)
Hi,
The problem has been located. The root cause is that the 
esxConnectToHost/esxConnectToVCenter functions defined in esx_driver.c 
collect the username and password. JNA allocates the memory for the username 
and password, but esxConnectToHost/esxConnectToVCenter then free that 
Java-allocated memory with the following lines:

VIR_FREE(username);
VIR_FREE(unescapedPassword);

When the JVM runs GC, it crashes because of the double free. If I comment out 
these two lines, the system works well. But I do not think this is a good 
solution. What is your opinion?
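
The ownership rule behind this crash can be illustrated in plain C. This is a minimal sketch under stated assumptions, not the ESX driver's code: demo_cred, demo_auth_callback and demo_consume are hypothetical names. The idea is that whatever the callback hands over must come from malloc, because the consumer will call free on it exactly once.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical credential slot filled in by an authentication callback. */
typedef struct {
    char *result;   /* heap string owned by whoever holds the struct */
} demo_cred;

/* Callback side: store a malloc'ed copy of the answer, handing ownership
 * to the caller. Returns 0 on success, -1 if allocation fails. */
int demo_auth_callback(demo_cred *cred, const char *answer)
{
    size_t len;

    assert(cred != NULL && answer != NULL);
    len = strlen(answer) + 1;
    cred->result = malloc(len);
    if (cred->result == NULL)
        return -1;
    memcpy(cred->result, answer, len);
    return 0;
}

/* Consumer side: use the string, then free it exactly once. Resetting the
 * pointer to NULL makes a repeated call harmless. Returns the length of
 * the string that was consumed. */
size_t demo_consume(demo_cred *cred)
{
    size_t n;

    assert(cred != NULL);
    n = cred->result ? strlen(cred->result) : 0;
    free(cred->result);
    cred->result = NULL;
    return n;
}
```

If the binding instead passes memory that its own runtime will later release, both sides end up freeing the same block, which matches the double free described above.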

B.R.
Benjamin Wang




-Original Message-
From: Benjamin Wang (gendwang) 
Sent: 2012年9月6日 21:43
To: 'veill...@redhat.com'
Cc: libvir-list@redhat.com
Subject: RE: [libvirt] Libvir JNA report SIGSEGV

Hi,
   I have looked into the code for several days, but I have not found the root 
cause. Even if I only call new Connect, the problem occurs, so it should be 
related to Connect.java or ConnectAuthDefault.java. Could you take a quick 
look at the issue and give me some hints?
Then I can try to fix this. Thanks!

B.R.
Benjamin Wang

-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com]
Sent: 2012年9月6日 19:05
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com
Subject: Re: [libvirt] Libvir JNA report SIGSEGV

On Thu, Sep 06, 2012 at 09:06:14AM +, Benjamin Wang (gendwang) wrote:
 Hi,
   Actually I also ran another test, shown below. When I comment out the 
 new Connect call, the program works well, so the problem is related 
 to libvirt JNA. If I manually run garbage collection for this program, it 
 still works well. But if I run garbage collection for the previous program, 
 it crashes. I guess this problem is caused by the ConnectAuth callback: when 
 garbage collection is executed, the callback memory is moved.

  Okay, maybe some memory needs to be pinned in some way. I take patches!

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [libvirt] Memory free in libvirt JNA

2012-09-09 Thread Benjamin Wang (gendwang)
Hi,
   I wrote some code to verify the memory leak problem, as follows.
C code in the shared library (.so):
void checkJNAMemLeak1(int **head, int *length)
{
    long i = 0;

    *head = (int *)malloc(sizeof(int) * 1);
    for (i = 0; i < 1; i++)
    {
        (*head)[i] = 1;
    }

    *length = 1;
}

Java code:
public static void testJNAMemLeak1()
{
PointerByReference head = new PointerByReference();
IntByReference length = new IntByReference();

while(true)
{
libben.checkJNAMemLeak1(head, length);
System.out.println(length.getValue());
sleep(1);

}
}

When we check memory with the top command, VIRT and RES increase very 
quickly. When we check with jconsole, this memory does not appear in the Java 
heap. Even if I trigger GC manually from jconsole, nothing happens.

If I change the Java code as follows:
public static void testJNAMemLeak1()
{
PointerByReference head = new PointerByReference();
IntByReference length = new IntByReference();

while(true)
{
libben.checkJNAMemLeak1(head, length);
System.out.println(length.getValue());
sleep(1);

libc.free(head.getValue());
}
}

Then everything works well: VIRT and RES no longer increase.
I think we must provide free functions for all the memory allocated by 
libvirt.
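
One way to read this proposal is that every allocating entry point of a native library gets a matching release function that a binding can call. The following is only a sketch of that pairing, with hypothetical names (demo_get_values, demo_free_values, demo_outstanding); the counter is there just to make leaks visible.

```c
#include <assert.h>
#include <stdlib.h>

static long demo_live_blocks = 0;   /* allocations handed out, not yet freed */

/* Allocate `count` ints, fill them with 0..count-1, and return the block
 * through `head` and its size through `length`. Returns 0 on success,
 * -1 on bad arguments or allocation failure. */
int demo_get_values(int **head, int *length, int count)
{
    int i;

    if (head == NULL || length == NULL || count <= 0)
        return -1;
    *head = malloc((size_t)count * sizeof(int));
    if (*head == NULL)
        return -1;
    for (i = 0; i < count; i++)
        (*head)[i] = i;
    *length = count;
    demo_live_blocks++;
    return 0;
}

/* Matching release function; a binding calls this instead of guessing
 * which allocator the library used. */
void demo_free_values(int *head)
{
    if (head == NULL)
        return;
    assert(demo_live_blocks > 0);
    free(head);
    demo_live_blocks--;
}

/* Number of blocks that have been handed out but not yet released. */
long demo_outstanding(void)
{
    return demo_live_blocks;
}
```

A binding would then map demo_free_values next to demo_get_values and call it once the returned data has been copied, rather than calling libc's free directly as in the test above.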

B.R.
Benjamin Wang


-Original Message-
From: Benjamin Wang (gendwang) 
Sent: 2012年9月7日 15:22
To: libvir-list@redhat.com
Cc: 'veill...@redhat.com'; Yang Zhou (yangzho)
Subject: RE: Memory free in libvirt JNA

Hi,
The overview part of the JNA API documentation says the following:
1. Description1:
If the native method returns char* and actually allocates memory, a return type 
of Pointer should be used to avoid leaking the memory. It is then up to you to 
take the necessary steps to free the allocated memory.

2. Description2:
Declare the method as returning a Structure of the appropriate type, then 
invoke Structure.toArray(int) to convert to an array of initialized structures 
of the appropriate size. Note that your Structure class must have a no-args 
constructor, and you are responsible for freeing the returned memory if 
applicable in whatever way is appropriate for the called function.

And the example code is as follows:
// Original C code
struct Display* get_displays(int* pcount);
void free_displays(struct Display* displays);

// Equivalent JNA mapping
Display get_displays(IntByReference pcount);
void free_displays(Display[] displays);
...
IntByReference pcount = new IntByReference();
Display d = lib.get_displays(pcount);
Display[] displays = (Display[])d.toArray(pcount.getValue());
...
lib.free_displays(displays);


That is to say, all memory allocated by native code must be freed explicitly 
on the JNA side. We must add some free methods to support this.
Any comments?

B.R.
Benjamin Wang





-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com]
Sent: 2012年8月20日 14:25
To: Benjamin Wang (gendwang)
Cc: st...@tvnet.hu; daniel.schwa...@dtnet.de
Subject: Re: Memory free in libvirt JNA

On Mon, Aug 20, 2012 at 05:15:45AM +, Benjamin Wang (gendwang) wrote:
 Hi Veillard,
   Thanks for your reply. I checked the current libvirt-JNA 
 implementation and found a method named free defined in the Domain 
 class, which is used to free the domain object. If this is mandatory, that is 
 to say, we should add a lot of methods to the current libvirt-JNA 
 implementation to free the memory allocated by the libvirt API. Please 
 correct me if I am wrong!

  As far as I understand, free() is aliased as finalize() on that object, so the 
Java runtime will call free() automatically on garbage collection. I'm not a 
Java expert; check some Java literature for more details about how this is 
done and the cases where free() might be better called directly.

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [libvirt] Memory free in libvirt JNA

2012-09-07 Thread Benjamin Wang (gendwang)
Hi,
The overview part of the JNA API documentation says the following:
1. Description1:
If the native method returns char* and actually allocates memory, a return type 
of Pointer should be used to avoid leaking the memory. It is then up to you to 
take the necessary steps to free the allocated memory.

2. Description2:
Declare the method as returning a Structure of the appropriate type, then 
invoke Structure.toArray(int) to convert to an array of initialized structures 
of the appropriate size. Note that your Structure class must have a no-args 
constructor, and you are responsible for freeing the returned memory if 
applicable in whatever way is appropriate for the called function.

And the example code is as follows:
// Original C code
struct Display* get_displays(int* pcount);
void free_displays(struct Display* displays);

// Equivalent JNA mapping
Display get_displays(IntByReference pcount);
void free_displays(Display[] displays);
...
IntByReference pcount = new IntByReference();
Display d = lib.get_displays(pcount);
Display[] displays = (Display[])d.toArray(pcount.getValue());
...
lib.free_displays(displays);


That is to say, all memory allocated by native code must be freed explicitly 
on the JNA side. We must add some free methods to support this.
Any comments?

B.R.
Benjamin Wang





-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com] 
Sent: 2012年8月20日 14:25
To: Benjamin Wang (gendwang)
Cc: st...@tvnet.hu; daniel.schwa...@dtnet.de
Subject: Re: Memory free in libvirt JNA

On Mon, Aug 20, 2012 at 05:15:45AM +, Benjamin Wang (gendwang) wrote:
 Hi Veillard,
   Thanks for your reply. I checked the current libvirt-JNA 
 implementation and found a method named free defined in the Domain 
 class, which is used to free the domain object. If this is mandatory, that is 
 to say, we should add a lot of methods to the current libvirt-JNA 
 implementation to free the memory allocated by the libvirt API. Please 
 correct me if I am wrong!

  As far as I understand, free() is aliased as finalize() on that object, so the 
Java runtime will call free() automatically on garbage collection. I'm not a 
Java expert; check some Java literature for more details about how this is 
done and the cases where free() might be better called directly.

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [libvirt] Libvir JNA report SIGSEGV

2012-09-06 Thread Benjamin Wang (gendwang)
Hi,
   The problem only occurs in the JNA part; pure C libvirt works well. Even if 
I only create a connection outside of the loop, the problem can still happen. 
The following is the simplest program that reproduces it:

public static void testcase1() throws LibvirtException
{
Connect conn=null;

//connect to the hypervisor
conn = new 
Connect("esx://10.74.125.69:443/?no_verify=1&transport=https", new 
ConnectAuthDefault(), 0);

while(true)
{
int[] array = new int[1];

try
{
Thread.sleep(1000);
}
catch(Exception e){}
}
}

B.R.
Benjamin Wang

-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com] 
Sent: 2012年9月6日 15:49
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com; Yang Zhou (yangzho)
Subject: Re: [libvirt] Libvir JNA report SIGSEGV

On Wed, Sep 05, 2012 at 08:59:07AM +, Benjamin Wang (gendwang) wrote:
 Hi,
   I tried to verify JNA in a concurrent situation but ran into some problems. 
 The following is my example code:
 public static void testcase1() throws LibvirtException
 {
 Connect conn=null;
 Connect conn1=null;
 
 //connect to the hypervisor
 conn = new 
 Connect("esx://10.74.125.68:443/?no_verify=1&transport=https", new 
 ConnectAuthDefault(), 0);
 System.out.println(conn.getVersion());
 
 //connect to the hypervisor
 conn1 = new 
 Connect("esx://10.74.125.90:443/?no_verify=1&transport=https", new 
 ConnectAuthDefault(), 0);
 System.out.println(conn1.getVersion());
 
 
 while(true)
 {
 int[] array = new int[1];
 Long version = conn.getVersion();
 Long version1 = conn1.getVersion();
 
 try
 {
  Thread.sleep(1000);
 }
 catch(Exception e)
 {
 }
 }
 }
 
 When I add the line int[] array = new int[1], the following error 
 is generated very quickly:
 # An unexpected error has been detected by Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x003f9b07046e, pid=30049, tid=1109510464
 #
 # Java VM: OpenJDK 64-Bit Server VM (1.6.0-b09 mixed mode linux-amd64)
 # Problematic frame:
 # C  [libc.so.6+0x7046e]
 #
 # An error report file with more information is saved as:
 
 I wrote similar code in C, as follows. It works well.
 static void virXenBasic_TC001(void)
 {
 virConnectPtr conn = NULL;
 virConnectPtr conn1 = NULL;
 unsigned long version = 0;
 unsigned long version1 = 0;
 char *hostname = NULL;
 
 conn = virConnectOpenAuth("esx://10.74.125.21/?no_verify=1", 
 virConnectAuthPtrDefault, 0);
 if (conn == NULL) {
 fprintf(stderr, "Failed to open connection to qemu:///system\n");
 return;
 }
 
 conn1 = virConnectOpenAuth("esx://192.168.119.40/?no_verify=1", 
 virConnectAuthPtrDefault, 0);
 if (conn1 == NULL) {
 fprintf(stderr, "Failed to open connection to qemu:///system\n");
 return;
 }
 
 while(true)
 {
 hostname = malloc(sizeof(char) * 1);
 virConnectGetVersion(conn, &version);
 virConnectGetVersion(conn, &version1);
 free(hostname);
 sleep(1);
 }
 return;
 }

Maybe you need to increase the stack or memory size of your Java process or 
something; that doesn't look related to libvirt at all, in my opinion.
Well, maybe the bindings fail to check for an allocation error somewhere, 
but is that in JNA?

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [libvirt] Libvir JNA report SIGSEGV

2012-09-06 Thread Benjamin Wang (gendwang)
Hi,
  Actually I also ran another test, shown below. When I comment out the 
new Connect call, the program works well, so the problem is related to 
libvirt JNA. If I manually run garbage collection for this program, it 
still works well. But if I run garbage collection for the previous program, 
it crashes. I guess this problem is caused by the ConnectAuth callback: when 
garbage collection is executed, the callback memory is moved.

B.R.
Benjamin Wang


 public static void testcase1() throws LibvirtException
 { 
 while(true)
 {
   int[] array = new int[1];
   
   try
   {
   Thread.sleep(1000);
   }
   catch(Exception e){}
 }
 }
-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com] 
Sent: 2012年9月6日 16:53
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com
Subject: Re: [libvirt] Libvir JNA report SIGSEGV

On Thu, Sep 06, 2012 at 07:53:24AM +, Benjamin Wang (gendwang) wrote:
 Hi,
 The problem only occurs in the JNA part; pure C libvirt works well. 
 Even if I only create a connection outside of the loop, the problem 
 can still happen. The following is the simplest program that reproduces 
 it:
 
 public static void testcase1() throws LibvirtException
 {
 Connect conn=null;
 
 //connect to the hypervisor
 conn = new 
 Connect("esx://10.74.125.69:443/?no_verify=1&transport=https", new 
 ConnectAuthDefault(), 0);
 
 while(true)
 {
   int[] array = new int[1];
   
   try
   {
   Thread.sleep(1000);
   }
   catch(Exception e){}
 }
 }

  Then it's a Java bug. The loop doesn't call or use libvirt in any way.
If it crashes in the loop, it looks to me like Java crashing!

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [libvirt] Libvir JNA report SIGSEGV

2012-09-06 Thread Benjamin Wang (gendwang)
Hi,
   I have looked into the code for several days, but I have not found the root 
cause. Even if I only call new Connect, the problem occurs, so it should be 
related to Connect.java or ConnectAuthDefault.java. Could you take a quick 
look at the issue and give me some hints?
Then I can try to fix this. Thanks!

B.R.
Benjamin Wang

-Original Message-
From: Daniel Veillard [mailto:veill...@redhat.com] 
Sent: 2012年9月6日 19:05
To: Benjamin Wang (gendwang)
Cc: libvir-list@redhat.com
Subject: Re: [libvirt] Libvir JNA report SIGSEGV

On Thu, Sep 06, 2012 at 09:06:14AM +, Benjamin Wang (gendwang) wrote:
 Hi,
   Actually I also ran another test, shown below. When I comment out the 
 new Connect call, the program works well, so the problem is related 
 to libvirt JNA. If I manually run garbage collection for this program, it 
 still works well. But if I run garbage collection for the previous program, 
 it crashes. I guess this problem is caused by the ConnectAuth callback: when 
 garbage collection is executed, the callback memory is moved.

  Okay, maybe some memory needs to be pinned in some way. I take patches!

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/


[libvirt] Libvir JNA report SIGSEGV

2012-09-05 Thread Benjamin Wang (gendwang)
Hi,
  I tried to verify JNA in a concurrent situation but ran into some problems. 
The following is my example code:
public static void testcase1() throws LibvirtException
{
Connect conn=null;
Connect conn1=null;

//connect to the hypervisor
conn = new 
Connect("esx://10.74.125.68:443/?no_verify=1&transport=https", new 
ConnectAuthDefault(), 0);
System.out.println(conn.getVersion());

//connect to the hypervisor
conn1 = new 
Connect("esx://10.74.125.90:443/?no_verify=1&transport=https", new 
ConnectAuthDefault(), 0);
System.out.println(conn1.getVersion());


while(true)
{
int[] array = new int[1];
Long version = conn.getVersion();
Long version1 = conn1.getVersion();

try
{
 Thread.sleep(1000);
}
catch(Exception e)
{
}
}
}

When I add the line int[] array = new int[1], the following error 
is generated very quickly:
# An unexpected error has been detected by Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x003f9b07046e, pid=30049, tid=1109510464
#
# Java VM: OpenJDK 64-Bit Server VM (1.6.0-b09 mixed mode linux-amd64)
# Problematic frame:
# C  [libc.so.6+0x7046e]
#
# An error report file with more information is saved as:

I wrote similar code in C, as follows. It works well.
static void virXenBasic_TC001(void)
{
virConnectPtr conn = NULL;
virConnectPtr conn1 = NULL;
unsigned long version = 0;
unsigned long version1 = 0;
char *hostname = NULL;

conn = virConnectOpenAuth("esx://10.74.125.21/?no_verify=1", 
virConnectAuthPtrDefault, 0);
if (conn == NULL) {
fprintf(stderr, "Failed to open connection to qemu:///system\n");
return;
}

conn1 = virConnectOpenAuth("esx://192.168.119.40/?no_verify=1", 
virConnectAuthPtrDefault, 0);
if (conn1 == NULL) {
fprintf(stderr, "Failed to open connection to qemu:///system\n");
return;
}

while(true)
{
hostname = malloc(sizeof(char) * 1);
virConnectGetVersion(conn, &version);
virConnectGetVersion(conn, &version1);
free(hostname);
sleep(1);
}
return;
}

B.R.
Benjamin Wang

[libvirt] Question about contribution

2012-07-16 Thread Benjamin Wang (gendwang)
Hi,
  I am from Cisco. We want to use and contribute to libvirt. One simple 
question:
If our product needs a new feature in libvirt, what is the process for 
submitting our contribution? Does it have to be approved by some committee?

B.R.
Benjamin