Re: Issue with GRE Tunnel between a VM and outside server

2024-01-20 Thread Fariborz Navidan
Hi Wei,

I highly appreciate your help. Your proposed solution worked for me.

Thanks.

On Sat, 20 Jan 2024, 16:36 Wei ZHOU,  wrote:

> Have you retried after adding a security group rule with protocol number
> 47?
>
> -Wei
>
>
> On Saturday, 20 January 2024, Fariborz Navidan  wrote:
>
> > Hi Daan,
> >
> > We still couldn't sort out this issue with our client VM. We are still
> > waiting for the community to direct us toward finding a solution.
> >
> > Regards.
> >
> > On Fri, 19 Jan 2024, 17:31 Daan Hoogland,  wrote:
> >
> > > Fariborz, any progress?
> > > Not a GRE expert, but glad to see you get on with your problem.
> > >
> > > On Sat, Jan 6, 2024 at 10:39 PM Fariborz Navidan <mdvlinqu...@gmail.com> wrote:
> > > >
> > > > Hi Dear Experts,
> > > >
> > > > We are running CloudStack 4.15.0.0 with two KVM hosts in a zone with
> > > > security groups enabled. We have a VM with a GRE tunnel set up between
> > > > it and a server outside our network. Both hosts were rebooted a few
> > > > days ago due to a power interruption. Before the reboot, the GRE
> > > > tunnel was working properly on the mentioned VM. After the reboot,
> > > > however, the GRE tunnel can still be established, but the machines
> > > > cannot reach each other via the tunnel's private IP addresses. All
> > > > ports and protocols are already added to the ingress rule set of the
> > > > security group the VM belongs to.
> > > >
> > > > Below is the output of the "ip a" and "ip r" commands on the VM
> > > > running on our CloudStack infrastructure.
> > > >
> > > > root@cdn-fr-1-kajgana-net:~# ip a
> > > > 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> > > >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > > >     inet 127.0.0.1/8 scope host lo
> > > >        valid_lft forever preferred_lft forever
> > > >     inet6 ::1/128 scope host
> > > >        valid_lft forever preferred_lft forever
> > > > 2: ens3:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> > > >     link/ether 1e:00:85:00:02:4d brd ff:ff:ff:ff:ff:ff
> > > >     inet 164.132.223.34/28 brd 164.132.223.47 scope global ens3
> > > >        valid_lft forever preferred_lft forever
> > > >     inet6 fe80::1c00:85ff:fe00:24d/64 scope link
> > > >        valid_lft forever preferred_lft forever
> > > > 3: gre0@NONE:  mtu 1476 qdisc noop state DOWN group default qlen 1000
> > > >     link/gre 0.0.0.0 brd 0.0.0.0
> > > > 4: gretap0@NONE:  mtu 1462 qdisc noop state DOWN group default qlen 1000
> > > >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > > > 5: erspan0@NONE:  mtu 1450 qdisc noop state DOWN group default qlen 1000
> > > >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > > > 6: gre1@NONE:  mtu 1476 qdisc noqueue state UNKNOWN group default qlen 1000
> > > >     link/gre 164.132.223.34 peer 89.205.123.34
> > > >     inet 192.168.169.1/30 scope global gre1
> > > >        valid_lft forever preferred_lft forever
> > > >     inet6 fe80::200:5efe:a484:df22/64 scope link
> > > >        valid_lft forever preferred_lft forever
> > > >
> > > > root@cdn-fr-1-kajgana-net:~# ip r
> > > > default via 164.132.223.46 dev ens3
> > > > 164.132.223.32/28 dev ens3 proto kernel scope link src 164.132.223.34
> > > > 192.168.169.0/30 dev gre1 proto kernel scope link src 192.168.169.1
> > > >
> > > > The IP address of the tunnel's other endpoint is 192.168.169.2, which
> > > > is unreachable from the VM. It looks like the GRE tunnel has been
> > > > established, but traffic cannot be passed through it.
> > > >
> > > > Is there something we need to do with the iptables rules on the hosts
> > > > to allow GRE traffic, or is there anything else we can do to address
> > > > this issue?
> > > >
> > > > Thanks in advance.
> > > > Regards.
> > >
> > >
> > >
> > > --
> > > Daan
> > >
> >
>


Re: Issue with GRE Tunnel between a VM and outside server

2024-01-20 Thread Wei ZHOU
Have you retried after adding a security group rule with protocol number 47?

-Wei
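
For anyone who lands on this thread later: the rule Wei describes can be
added from the UI (an ingress rule with protocol number 47 on the VM's
security group) or via the API. A rough CloudMonkey sketch, where the
security group ID and the remote GRE endpoint CIDR are placeholders and the
exact syntax may vary between versions:

  # allow GRE (IP protocol number 47) into the security group the VM belongs to
  cmk authorize securitygroupingress \
      securitygroupid=<sg-uuid> \
      protocol=47 \
      cidrlist=<remote-gre-endpoint>/32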


On Saturday, 20 January 2024, Fariborz Navidan  wrote:

> Hi Daan,
>
> We still couldn't sort out this issue with our client VM. We are still
> waiting for the community to direct us toward finding a solution.
>
> Regards.
>
> On Fri, 19 Jan 2024, 17:31 Daan Hoogland,  wrote:
>
> > Fariborz, any progress?
> > Not a GRE expert, but glad to see you get on with your problem.
> >
> > On Sat, Jan 6, 2024 at 10:39 PM Fariborz Navidan 
> > wrote:
> > >
> > > Hi Dear Experts,
> > >
> > > We are running CloudStack 4.15.0.0 with two KVM hosts in a zone with
> > > security groups enabled. We have a VM with a GRE tunnel set up between
> > > it and a server outside our network. Both hosts were rebooted a few
> > > days ago due to a power interruption. Before the reboot, the GRE tunnel
> > > was working properly on the mentioned VM. After the reboot, however,
> > > the GRE tunnel can still be established, but the machines cannot reach
> > > each other via the tunnel's private IP addresses. All ports and
> > > protocols are already added to the ingress rule set of the security
> > > group the VM belongs to.
> > >
> > > Below is the output of the "ip a" and "ip r" commands on the VM running
> > > on our CloudStack infrastructure.
> > >
> > > root@cdn-fr-1-kajgana-net:~# ip a
> > > 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> > >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > >     inet 127.0.0.1/8 scope host lo
> > >        valid_lft forever preferred_lft forever
> > >     inet6 ::1/128 scope host
> > >        valid_lft forever preferred_lft forever
> > > 2: ens3:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> > >     link/ether 1e:00:85:00:02:4d brd ff:ff:ff:ff:ff:ff
> > >     inet 164.132.223.34/28 brd 164.132.223.47 scope global ens3
> > >        valid_lft forever preferred_lft forever
> > >     inet6 fe80::1c00:85ff:fe00:24d/64 scope link
> > >        valid_lft forever preferred_lft forever
> > > 3: gre0@NONE:  mtu 1476 qdisc noop state DOWN group default qlen 1000
> > >     link/gre 0.0.0.0 brd 0.0.0.0
> > > 4: gretap0@NONE:  mtu 1462 qdisc noop state DOWN group default qlen 1000
> > >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > > 5: erspan0@NONE:  mtu 1450 qdisc noop state DOWN group default qlen 1000
> > >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > > 6: gre1@NONE:  mtu 1476 qdisc noqueue state UNKNOWN group default qlen 1000
> > >     link/gre 164.132.223.34 peer 89.205.123.34
> > >     inet 192.168.169.1/30 scope global gre1
> > >        valid_lft forever preferred_lft forever
> > >     inet6 fe80::200:5efe:a484:df22/64 scope link
> > >        valid_lft forever preferred_lft forever
> > >
> > > root@cdn-fr-1-kajgana-net:~# ip r
> > > default via 164.132.223.46 dev ens3
> > > 164.132.223.32/28 dev ens3 proto kernel scope link src 164.132.223.34
> > > 192.168.169.0/30 dev gre1 proto kernel scope link src 192.168.169.1
> > >
> > > The IP address of the tunnel's other endpoint is 192.168.169.2, which
> > > is unreachable from the VM. It looks like the GRE tunnel has been
> > > established, but traffic cannot be passed through it.
> > >
> > > Is there something we need to do with the iptables rules on the hosts
> > > to allow GRE traffic, or is there anything else we can do to address
> > > this issue?
> > >
> > > Thanks in advance.
> > > Regards.
> >
> >
> >
> > --
> > Daan
> >
>
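
On the iptables question quoted above: with security groups on KVM, the host
rules are programmed per VM by the CloudStack agent, so rather than editing
them by hand it is usually enough to confirm on the host whether GRE packets
are reaching, or being dropped for, the instance. A hedged sketch, run on
the KVM host while pinging through the tunnel:

  # watch for GRE (IP protocol 47) to/from the VM's public IP
  tcpdump -ni any 'ip proto 47 and host 164.132.223.34'

  # inspect the per-VM security-group chains and their packet counters
  iptables -L -n -v | less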


Re: Issue with GRE Tunnel between a VM and outside server

2024-01-20 Thread Fariborz Navidan
Hi Daan,

We still couldn't sort out this issue with our client VM. We are still
waiting for the community to direct us toward finding a solution.

Regards.

On Fri, 19 Jan 2024, 17:31 Daan Hoogland,  wrote:

> Fariborz, any progress?
> Not a GRE expert, but glad to see you get on with your problem.
>
> On Sat, Jan 6, 2024 at 10:39 PM Fariborz Navidan 
> wrote:
> >
> > Hi Dear Experts,
> >
> > We are running CloudStack 4.15.0.0 with two KVM hosts in a zone with
> > security groups enabled. We have a VM with a GRE tunnel set up between
> > it and a server outside our network. Both hosts were rebooted a few days
> > ago due to a power interruption. Before the reboot, the GRE tunnel was
> > working properly on the mentioned VM. After the reboot, however, the GRE
> > tunnel can still be established, but the machines cannot reach each
> > other via the tunnel's private IP addresses. All ports and protocols are
> > already added to the ingress rule set of the security group the VM
> > belongs to.
> >
> > Below is the output of the "ip a" and "ip r" commands on the VM running
> > on our CloudStack infrastructure.
> >
> > root@cdn-fr-1-kajgana-net:~# ip a
> > 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >     inet 127.0.0.1/8 scope host lo
> >        valid_lft forever preferred_lft forever
> >     inet6 ::1/128 scope host
> >        valid_lft forever preferred_lft forever
> > 2: ens3:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >     link/ether 1e:00:85:00:02:4d brd ff:ff:ff:ff:ff:ff
> >     inet 164.132.223.34/28 brd 164.132.223.47 scope global ens3
> >        valid_lft forever preferred_lft forever
> >     inet6 fe80::1c00:85ff:fe00:24d/64 scope link
> >        valid_lft forever preferred_lft forever
> > 3: gre0@NONE:  mtu 1476 qdisc noop state DOWN group default qlen 1000
> >     link/gre 0.0.0.0 brd 0.0.0.0
> > 4: gretap0@NONE:  mtu 1462 qdisc noop state DOWN group default qlen 1000
> >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > 5: erspan0@NONE:  mtu 1450 qdisc noop state DOWN group default qlen 1000
> >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> > 6: gre1@NONE:  mtu 1476 qdisc noqueue state UNKNOWN group default qlen 1000
> >     link/gre 164.132.223.34 peer 89.205.123.34
> >     inet 192.168.169.1/30 scope global gre1
> >        valid_lft forever preferred_lft forever
> >     inet6 fe80::200:5efe:a484:df22/64 scope link
> >        valid_lft forever preferred_lft forever
> >
> > root@cdn-fr-1-kajgana-net:~# ip r
> > default via 164.132.223.46 dev ens3
> > 164.132.223.32/28 dev ens3 proto kernel scope link src 164.132.223.34
> > 192.168.169.0/30 dev gre1 proto kernel scope link src 192.168.169.1
> >
> > The IP address of the tunnel's other endpoint is 192.168.169.2, which is
> > unreachable from the VM. It looks like the GRE tunnel has been
> > established, but traffic cannot be passed through it.
> >
> > Is there something we need to do with the iptables rules on the hosts to
> > allow GRE traffic, or is there anything else we can do to address this
> > issue?
> >
> > Thanks in advance.
> > Regards.
>
>
>
> --
> Daan
>


Re: Issues migrating primary storage

2024-01-20 Thread Jeremy Hansen
I’m trying to put my NFS primary storage into maintenance mode, which I
believe is supposed to migrate all of its storage, correct? The problem is
that I don’t know how to get a status on this job, so I can’t really tell
whether it’s working. The management server doesn’t really have anything in
the logs, and I don’t see any new images, or images growing, on the Ceph
side. So I just don’t know if it’s working or how far along the migration is.

-jeremy
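
A few ways to get some visibility, sketched with placeholder IDs, pool and
image names; the log path below is the usual default and may differ on your
install:

  # CloudStack side: list recent async jobs and check jobstatus/jobresult
  cmk list asyncjobs

  # follow the management server log for migration/copy activity
  tail -f /var/log/cloudstack/management/management-server.log | grep -iE 'migrat|copy'

  # Ceph side: has a destination image appeared, and how much data does it hold?
  rbd ls <pool-name>
  rbd du <pool-name>/<image-name>

  # NFS side: is the source qcow2 actually thin? compare "virtual size" vs "disk size"
  qemu-img info /mnt/<primary-storage-uuid>/<volume-uuid>

The last check also speaks to the thin-provisioning question further down
the thread: making the device an LVM PV inside the guest does not by itself
fill the qcow2; only blocks the guest has actually written get allocated.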

> On Friday, Jan 19, 2024 at 12:34 AM, Jeremy Hansen <jer...@skidrow.la> wrote:
> I’m still having issues. Is it unreasonable to migrate 1TB images over a 10G
> network? Any other ideas for things to try would be much appreciated.
>
> -jeremy
>
>
>
> > On Wednesday, Jan 17, 2024 at 12:49 PM, Jeremy Hansen <jer...@skidrow.la> wrote:
> > Extending these timeouts in the “wait” configs seems to have helped. One of 
> > my 1TB volumes is finally migrating.
> >
> > What I’ve noticed is that if I allocate a new 1TB volume, I can migrate it
> > between NFS and Ceph and it takes only about a minute. I assume this is
> > because it’s “thin provisioned” and there’s no actual data on the volume.
> >
> > But these other volumes I’m trying to move are also “thin provisioned”,
> > although they’re part of an LVM group. Does making a thin-provisioned
> > device part of an LVM group defeat the thin provisioning? I know these
> > volumes weren’t full, but perhaps, since each is a PV in an LVM config,
> > that defeats the thin provisioning and it gets counted as a full 1TB
> > volume? I’m just spitballing, but I’m trying to understand how this works
> > so we can do the right thing when provisioning additional volumes.
> >
> > Also, the behavior I’m seeing is that it takes a very long time before I
> > see the block image show up on the Ceph side. Perhaps it preallocates an
> > image before copying the data? But it seemed strange that I wouldn’t
> > immediately see the image appear on the Ceph side after initiating a
> > migration. It’s hard to see what’s actually going on from the logs and
> > the interface.
> >
> > Thanks
> > -jeremy
> >
> >
> >
> > > On Tuesday, Jan 16, 2024 at 11:29 PM, Jeremy Hansen <jer...@skidrow.la> wrote:
> > > I changed copy.volume.wait to 72000
> > >
> > > But I just noticed:
> > >
> > > kvm.storage.online.migration.wait and kvm.storage.offline.migration.wait.
> > > Worth changing these as well?
> > >
> > > Thanks
> > > -jeremy
> > >
> > >
> > > > On Tuesday, Jan 16, 2024 at 11:01 PM, Jithin Raju <jithin.r...@shapeblue.com> wrote:
> > > > Hi Jeremy,
> > > >
> > > > Have you checked the ‘wait’ parameter? It is used as a timeout of wait * 2.
> > > >
> > > > -Jithin
> > > >
> > > > From: Jeremy Hansen 
> > > > Date: Wednesday, 17 January 2024 at 12:14 PM
> > > > To: users@cloudstack.apache.org 
> > > > Subject: Re: Issues migrating primary storage
> > > > Unfortunately the upgrade didn’t help:
> > > >
> > > > Resource [StoragePool:3] is unreachable: Volume 
> > > > [{"name”:”bigdisk","uuid":"8f24b8a6-229a-4311-9ddc-d6c6acb89aca"}] 
> > > > migration failed due to 
> > > > [com.cloud.utils.exception.CloudRuntimeException: Failed to copy 
> > > > /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/8f24b8a6-229a-4311-9ddc-d6c6acb89aca
> > > >  to 5837f4e6-9307-43a9-a50c-8c9c885f25e8.qcow2].
> > > >
> > > >
> > > >
> > > > Anything else I can try? I’m trying to move away from NFS completely.
> > > >
> > > > -jeremy
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tuesday, Jan 16, 2024 at 7:06 AM, Suresh Kumar Anaparti <sureshkumar.anapa...@gmail.com> wrote:
> > > > Hi Jeremy,
> > > >
> > > > Can you extend the 'migratewait' config and check?
> > > >
> > > > Regards,
> > > > Suresh
> > > >
> > > > On Tue, Jan 16, 2024 at 1:45 PM Jeremy Hansen  wrote:
> > > >
> > > >
> > > > I have some large volumes I’m trying to migrate from NFS to Ceph/RBD:
> > > > 1TB volumes. These inevitably time out. I extended these configs:
> > > >
> > > > copy.volume.wait=72000
> > > > job.cancel.threshold.minutes=480
> > > > job.expire.minutes=1440
> > > >
> > > > This helped with smaller volumes, but large ones still eventually fail.
> > > >
> > > > 2024-01-16 07:50:25,929 DEBUG [c.c.a.t.Request]
> > > > (AgentManager-Handler-8:null) (logid:) Seq 1-5583619113009291196:
> > > > Processing: { Ans: , MgmtId: 20558852646968, via: 1, Ver: v1, Flags: 10,
> > > > [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":"false","details":"com.cloud.utils.exception.CloudRuntimeException:
> > > > Failed to copy
> > > > /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e
> > > > to
> > > > b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2","wait":"0","bypassHostMaintenance":"false"}}]
> > > > }
> > > >
> > > > 2024-01-16 07:50:26,698 DEBUG [c.c.s.VolumeApiServiceImpl]
> > > > (Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176 ctx-bc7b188b)
> > > >