Re: [openstack-dev] Recommended ways to find compute node's bandwidth (physical NIC)

2016-06-19 Thread Sudipto Biswas

Thanks Jay for pointing this out.

Adnan,

Yeah - you could use PCP - if that module gets integrated with Nova.

For your benefit, here are some of the metrics that PCP can provide 
related to the physical NIC:


pminfo | grep network.interface  <== Listing the available metrics.
network.interface.collisions
network.interface.mtu
network.interface.speed
network.interface.baudrate
network.interface.duplex
network.interface.up
network.interface.running
network.interface.inet_addr
network.interface.ipv6_addr
network.interface.ipv6_scope
network.interface.hw_addr
network.interface.in.bytes
network.interface.in.packets
network.interface.in.errors
network.interface.in.drops
network.interface.in.fifo
network.interface.in.frame
network.interface.in.compressed
network.interface.in.mcasts
network.interface.out.bytes
network.interface.out.packets
network.interface.out.errors
network.interface.out.drops
network.interface.out.fifo
network.interface.out.carrier
network.interface.out.compressed
network.interface.total.bytes
network.interface.total.packets
network.interface.total.errors
network.interface.total.drops
network.interface.total.mcasts


# pmval network.interface.out.bytes  <== A sample metric value.
metric:    network.interface.out.bytes
host:      
semantics: cumulative counter (converting to rate)
units:     byte (converting to byte / sec)
samples:   all

  wlp3s0        lo    virbr0       wc0       em1  virbr0-nic
     0.0       0.0       0.0       0.0       0.0       0.0
   85.96       0.0       0.0       0.0       0.0       0.0
     0.0       0.0       0.0       0.0       0.0       0.0
     0.0       0.0       0.0       0.0       0.0       0.0
   1588.     4031.       0.0     787.6       0.0       0.0
   492.7       0.0       0.0     155.9       0.0       0.0
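The "converting to rate" line means pmval samples the cumulative counter periodically and divides the delta by the sampling interval. A minimal sketch of that conversion, with hypothetical counter values (not PCP's actual code):

```python
def counter_rate(prev_bytes, curr_bytes, interval_secs):
    """Turn two samples of a cumulative byte counter into bytes/sec,
    mirroring pmval's 'converting to rate' behaviour."""
    if interval_secs <= 0:
        raise ValueError("interval must be positive")
    return (curr_bytes - prev_bytes) / interval_secs

# Two hypothetical samples of network.interface.out.bytes, 1 second apart:
print(counter_rate(1_000_000, 1_001_588, 1.0))  # 1588.0 bytes/sec
```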



As you might observe, there's plenty here for you to draw inferences from.

Hope this helps!

Cheers,
Sudipto

On Monday 20 June 2016 09:07 AM, Jay Pipes wrote:

On 06/16/2016 04:16 PM, KHAN, RAO ADNAN wrote:

I am writing a nova filter that will check for the compute node (max,
avg) bandwidth, before instantiating an instance. What are some of the
recommended tools that can provide this info in real time? Does any
openstack component hold this info already?


You may want to chat with Sudipta Biswas (cc'd). The PCP module (yes, 
it's an awkward name for a performance profiling module...) does some 
of what you're looking for and Sudipta has been attempting to 
integrate it with Nova for this express purpose.


Best,
-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






Re: [openstack-dev] [Higgins][Zun] Project roadmap

2016-06-13 Thread Sudipto Biswas



On Monday 13 June 2016 06:57 PM, Flavio Percoco wrote:

On 12/06/16 22:10 +, Hongbin Lu wrote:

Hi team,

During the team meetings these weeks, we collaborated on the initial 
project roadmap. I summarized it as below. Please review.


* Implement a common container abstraction for different container 
runtimes. The initial implementation will focus on supporting basic 
container operations (i.e. CRUD).
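For illustration, a common container abstraction with basic CRUD operations could be sketched like this in Python (names are hypothetical, not Zun's actual driver interface):

```python
import abc

class ContainerDriver(abc.ABC):
    """Illustrative common abstraction over container runtimes,
    exposing only the basic CRUD operations discussed above."""

    @abc.abstractmethod
    def create(self, name, image): ...

    @abc.abstractmethod
    def show(self, container_id): ...

    @abc.abstractmethod
    def update(self, container_id, **patch): ...

    @abc.abstractmethod
    def delete(self, container_id): ...

class InMemoryDriver(ContainerDriver):
    """Toy backend standing in for a real runtime such as Docker."""
    def __init__(self):
        self._containers = {}

    def create(self, name, image):
        cid = str(len(self._containers) + 1)
        self._containers[cid] = {"name": name, "image": image}
        return cid

    def show(self, container_id):
        return self._containers[container_id]

    def update(self, container_id, **patch):
        self._containers[container_id].update(patch)

    def delete(self, container_id):
        del self._containers[container_id]
```

A per-runtime driver (docker, kubernetes, ...) would implement the same interface, which is what lets both the Zun-native and Nova-facing APIs share one backend.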


What COEs are being considered for the first implementation? Just 
Docker and Kubernetes?


* Focus on non-nested containers use cases (running containers on 
physical hosts), and revisit nested containers use cases (running 
containers on VMs) later.
* Provide two sets of APIs to access containers: The Nova APIs and the 
Zun-native APIs. In particular, the Zun-native APIs will expose full 
container capabilities, and Nova APIs will expose capabilities that 
are shared between containers and VMs.


- Is the nova side going to be implemented in the form of a Nova driver
(like ironic's)? What do you mean by APIs here?

- What operations are we expecting this to support (just CRUD operations
on containers)?

I can see this driver being useful for specialized services like Trove,
but I'm curious/concerned about how this will be used by end users
(assuming that's the goal).



* Leverage Neutron (via Kuryr) for container networking.
* Leverage Cinder for container data volume.
* Leverage Glance for storing container images. If necessary, 
contribute to Glance for missing features (i.e. support layer of 
container images).


Are you aware of https://review.openstack.org/#/c/249282/ ?
This support is very minimalistic in nature, since it doesn't do
anything beyond just storing a docker FS tarball.
I think it was felt that further support for the docker FS was needed.
While there were suggestions of a private docker registry, having
something in band (w.r.t. OpenStack) may be desirable.

* Support enforcing multi-tenancy by doing the following:
** Add configurable options for scheduler to enforce neighboring 
containers belonging to the same tenant.

** Support hypervisor-based container runtimes.

The following topics have been discussed, but the team cannot reach 
consensus on including them into the short-term project scope. We 
skipped them for now and might revisit them later.

* Support proxying API calls to COEs.


Any link to what this proxy will do and what service it'll talk to? I'd
generally advise against having proxy calls in services. We've just done
work in Nova to deprecate the Nova Image proxy.

* Advanced container operations (i.e. keep container alive, load 
balancer setup, rolling upgrade).

* Nested containers use cases (i.e. provision container hosts).
* Container composition (i.e. support docker-compose like DSL).

NOTE: I might have forgotten or misunderstood something. Please feel 
free to point out if anything is wrong or missing.


It sounds like you've got more than enough to work on for now; I think
it's fine to table these topics for now.

just my $0.02
Flavio








Re: [openstack-dev] [dragonflow] Low OVS version for Ubuntu

2015-09-17 Thread Sudipto Biswas



On Thursday 17 September 2015 12:22 PM, Li Ma wrote:

Hi all,

I tried to run devstack to deploy dragonflow, but it failed due to a low
OVS version.

I used Ubuntu 14.10 server, but the official OVS package is 2.1.3, which
is much lower than the required version (2.3.1+).

So, can anyone provide an Ubuntu repository that contains the correct
OVS packages?


Why don't you just build the OVS you want from here: 
http://openvswitch.org/download/
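For reference, a sketch of building OVS from the upstream release tarball, wrapped in a function. The version number, dependency list, and tarball URL layout are assumptions - check http://openvswitch.org/download/ for the current release and its install notes:

```shell
# Sketch only: builds Open vSwitch from a release tarball on Ubuntu.
# Version, package names, and URL are illustrative assumptions.
build_ovs() {
    ver="${1:-2.3.1}"
    # Build dependencies (names may vary by Ubuntu release).
    sudo apt-get install -y build-essential fakeroot libssl-dev
    wget "http://openvswitch.org/releases/openvswitch-${ver}.tar.gz"
    tar xzf "openvswitch-${ver}.tar.gz"
    cd "openvswitch-${ver}" || return 1
    # Build the kernel module against the running kernel's headers.
    ./configure --with-linux="/lib/modules/$(uname -r)/build"
    make && sudo make install
}
```

Usage would be `build_ovs 2.3.1`, followed by loading the kernel module and restarting the OVS services as your distro expects.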


Thanks,





[openstack-dev] [nova] Regarding NUMA Topology filtering logic.

2015-09-16 Thread Sudipto Biswas

Hi,

Currently the numa_topology filter code in OpenStack filters out NUMA
nodes based on the length of the cpusets on the NUMA node of a host [1].
For example: if a VM with 8 VCPUs is requested, we seem to check that
len(cpuset_on_the_numa_node) is greater than or equal to 8.


IMHO, the logic can be further improved if we start taking the threads 
and cores
into consideration instead of directly going by the cpuset length of the 
NUMA node.


This thought is derived from an architecture like ppc, where each core
can have 8 threads. However, in this case, libvirt reports only 1 thread
out of the 8 (called the primary thread). The host scheduling of the
guests happens at the core level (as only the primary thread is online).
The KVM scheduler exploits as many threads of the core as needed by the
guest.

Consider an example for the ppc architecture. In a given NUMA node 0 -
with 40 threads - the following cpusets would be reported by libvirt:
0, 8, 16, 24, 32. The length of the cpusets would suggest that only 5
pCPUs are available for pinning; however, we could potentially have 40
threads available (cores * threads).

This way we could at least solve the problem that arises if we just take
the length of the cpusets into consideration.
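To make the ppc example concrete, here is an illustrative sketch of the proposed accounting (not Nova's actual filter code; threads_per_core would come from the host topology):

```python
def usable_pcpus(cpuset, threads_per_core):
    """On hosts where libvirt exposes only each core's primary thread
    (e.g. ppc), len(cpuset) counts cores, so the schedulable capacity
    is cores * threads_per_core rather than len(cpuset)."""
    return len(cpuset) * threads_per_core

# NUMA node 0 from the example above: primary threads 0, 8, 16, 24, 32.
cpuset = {0, 8, 16, 24, 32}
print(len(cpuset))              # 5  - what the length check sees today
print(usable_pcpus(cpuset, 8))  # 40 - threads actually available
```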

Thoughts?

[1] https://github.com/openstack/nova/blob/master/nova/virt/hardware.py#L772

Thanks,
Sudipto




Re: [openstack-dev] [Neutron] Regarding neutron bug # 1432582

2015-04-13 Thread Sudipto Biswas

Thanks, I have got a patchset out for review.
I have removed the exception that was being thrown back to the agent and 
have reduced the fix to just logging a meaningful message in the neutron 
server logs.

Appreciate your comments on the same.

Thanks,
Sudipto
On Monday 13 April 2015 11:56 AM, Kevin Benton wrote:
I would like to see some form of this merged, at least as an error 
message. If a server has a bad CMOS battery and suffers a power 
outage, its clock could easily be several years behind. In that 
scenario, the NTP daemon could refuse to sync due to a sanity check.


On Wed, Apr 8, 2015 at 10:46 AM, Sudipto Biswas 
sbisw...@linux.vnet.ibm.com mailto:sbisw...@linux.vnet.ibm.com wrote:


Hi Guys, I'd really appreciate your feedback on this.

Thanks,
Sudipto


On Monday 30 March 2015 12:11 PM, Sudipto Biswas wrote:

Someone from my team had installed the OS on baremetal with a wrong
'date'. When this node was added to the OpenStack controller, the logs
from the neutron-agent on the compute node showed - AMQP connected.
But the neutron agent-list command would not list this agent at all.

I could figure out the problem when the neutron-server debug logs were
enabled and they vaguely pointed at the rejection of AMQP connections
due to a timestamp mismatch. The neutron-server was treating these
requests as stale due to the timestamp of the node being behind the
neutron-server. However, there's no good way to detect this if the
agent runs on a node which is ahead of time.

I recently raised a bug here:
https://bugs.launchpad.net/neutron/+bug/1432582

And tried to resolve this with the review:
https://review.openstack.org/#/c/165539/

It went through quite a few +2s after 15-odd patch sets, but we still
haven't reached common ground w.r.t. addressing this situation.

My fix tries to log better and throw an exception to the neutron agent
on FIRST-time boot of the agent for better detection of the problem.

I would like to get your thoughts on this fix: whether it seems
legitimate to have the fix per the patch, OR could you suggest an
approach to tackle this, OR suggest just abandoning the change.













--
Kevin Benton







Re: [openstack-dev] [Neutron] Regarding neutron bug # 1432582

2015-04-08 Thread Sudipto Biswas

Hi Guys, I'd really appreciate your feedback on this.

Thanks,
Sudipto

On Monday 30 March 2015 12:11 PM, Sudipto Biswas wrote:
Someone from my team had installed the OS on baremetal with a wrong
'date'. When this node was added to the OpenStack controller, the logs
from the neutron-agent on the compute node showed - AMQP connected.
But the neutron agent-list command would not list this agent at all.

I could figure out the problem when the neutron-server debug logs were
enabled and they vaguely pointed at the rejection of AMQP connections
due to a timestamp mismatch. The neutron-server was treating these
requests as stale due to the timestamp of the node being behind the
neutron-server. However, there's no good way to detect this if the
agent runs on a node which is ahead of time.

I recently raised a bug here:
https://bugs.launchpad.net/neutron/+bug/1432582

And tried to resolve this with the review:
https://review.openstack.org/#/c/165539/

It went through quite a few +2s after 15-odd patch sets, but we still
haven't reached common ground w.r.t. addressing this situation.

My fix tries to log better and throw an exception to the neutron agent
on FIRST-time boot of the agent for better detection of the problem.

I would like to get your thoughts on this fix: whether it seems
legitimate to have the fix per the patch, OR could you suggest an
approach to tackle this, OR suggest just abandoning the change.











[openstack-dev] [Neutron] Regarding neutron bug # 1432582

2015-03-30 Thread Sudipto Biswas

Someone from my team had installed the OS on baremetal with a wrong
'date'. When this node was added to the OpenStack controller, the logs
from the neutron-agent on the compute node showed - AMQP connected.
But the neutron agent-list command would not list this agent at all.

I could figure out the problem when the neutron-server debug logs were
enabled and they vaguely pointed at the rejection of AMQP connections
due to a timestamp mismatch. The neutron-server was treating these
requests as stale due to the timestamp of the node being behind the
neutron-server. However, there's no good way to detect this if the
agent runs on a node which is ahead of time.
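As a sketch of the kind of detection being proposed - comparing an agent's reported timestamp against the server clock in both directions - something like the following could work (the tolerance and function names are hypothetical, not neutron's actual code):

```python
import time

MAX_SKEW_SECONDS = 300  # hypothetical tolerance, not a neutron setting

def check_agent_clock(agent_timestamp, now=None, max_skew=MAX_SKEW_SECONDS):
    """Return a warning string when the agent's clock differs from the
    server's by more than max_skew seconds, ahead or behind; else None.
    Unlike a stale-message check, this catches clocks ahead of time too."""
    now = time.time() if now is None else now
    skew = agent_timestamp - now
    if abs(skew) > max_skew:
        direction = "ahead of" if skew > 0 else "behind"
        return ("agent clock is %.0f seconds %s the server; "
                "check NTP on the agent host" % (abs(skew), direction))
    return None
```

The server could run this on the agent's first report and log the warning, which is roughly the shape of the fix proposed in the review below.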

I recently raised a bug here: 
https://bugs.launchpad.net/neutron/+bug/1432582


And tried to resolve this with the review:
https://review.openstack.org/#/c/165539/

It went through quite a few +2s after 15-odd patch sets, but we still
haven't reached common ground w.r.t. addressing this situation.

My fix tries to log better and throw an exception to the neutron agent
on FIRST-time boot of the agent for better detection of the problem.

I would like to get your thoughts on this fix: whether it seems
legitimate to have the fix per the patch, OR could you suggest an
approach to tackle this, OR suggest just abandoning the change.


