For some years I have been seeing nodejs based applications that do not work in IPv6 only networks. More recently, I again found a nodejs based application that does not even install if you try to install it in an IPv6 only network.

As the situation is not straightforward, I started to collect information about it on this website.

The starting point

I wanted to install etherpad-lite and it failed with the following error:

174 error request to https://registry.npmjs.org/express-session/-/express-session-1.17.1.tgz failed, reason: connect EHOSTUNREACH 104.16.25.35:443

The message connect EHOSTUNREACH 104.16.25.35:443 already clearly points to the problem: npm is trying to connect via IPv4 on an IPv6 only VM. This clearly cannot work.

A bug in NPM?

My first suspicion was that it must be a bug in npm. But on Twitter I was told that npm should work in IPv6 only networks. That's strange. However it turns out that somebody else had this problem before and it seems to be specific to using npm on Alpine Linux.

A bug in Alpine Linux?

Alpine Linux is currently the main distribution that I use. Not because of the small libc called musl, but because the whole system is straightforward, correct and easy to use. But what does that have to do with etherpad-lite failing to install in an IPv6 only network?

It turns out that there is a difference between musl and glibc in the default behaviour of getaddrinfo(), which is used to retrieve DNS results from the operating system.

A bug in musl libc?

I got in touch with the developers of musl and the statement is rather easy: musl is behaving according to the spec and the caller, in this context nodejs, cannot just use the first result, but has to potentially try all results.

A DNS or a design bug?

And at this stage the problem gets tricky. Let's review what I wanted to do and why we are so deep in the rabbit hole.

I wanted to install etherpad-lite, which uses resources from registry.npmjs.org. So npm wants to connect via HTTPS to registry.npmjs.org and download a file. To achieve this, npm has to find out which IP address registry.npmjs.org has. And for this it is doing a DNS lookup.

So far, so good. Now the trouble begins:

A DNS lookup can contain 0, 1 or many answers.

And in the case of the libc call getaddrinfo, the result is a list of IPv6 and IPv4 addresses, potentially 0 to many of each.

So an application that "just wants to connect somewhere" cannot just take the first result. If the first address belongs to a family that the host cannot reach, it has to try the next one.
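To illustrate what "trying all results" means, here is a minimal sketch in Python (not the nodejs code in question). The function name connect_any and its resolver parameter are made up for this example; only socket.getaddrinfo and the socket calls are real library functions.

```python
import socket


def connect_any(host, port, timeout=5.0, resolver=socket.getaddrinfo):
    """Try every address returned by the resolver until one connects.

    connect_any and the resolver parameter are hypothetical names used
    for illustration; resolver defaults to the real socket.getaddrinfo.
    """
    last_error = None
    for family, socktype, proto, _canon, sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock  # first address that works wins
        except OSError as err:
            # e.g. EHOSTUNREACH for an IPv4 address on an IPv6 only host
            last_error = err
            if sock is not None:
                sock.close()
    if last_error is None:
        raise OSError("no addresses returned for %s" % host)
    raise last_error
```

Calling connect_any("registry.npmjs.org", 443) on an IPv6 only host would skip the unreachable IPv4 entries and return a socket for the first reachable IPv6 address, regardless of the order in which the libc returned them.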

A bug in nodejs?

The assumption at this point is that nodejs only takes the first result from DNS and tries to connect to it. However so far I have not been able to spot the exact source code location to support that claim.

Stay tuned...

Posted Sat Jan 23 09:48:31 2021 Tags:

It's time for the 36c3 and to verify that some things are in place where they should be.

As some of you might know, I am using IPv6 extensively to provide services anywhere on anything, so you will see quite some IPv6 related rules in my configuration.

This post should serve two purposes:

  • Inspire others to verify their network settings prior to the congress
  • Get feedback from anyone spotting a huge mistake in my config :-)

The firewall rules

I am using nftables on my notebook and the full ruleset is shown below.

table ip6 filter {
        chain input {
                type filter hook input priority 0;
                policy drop;

                iif lo accept
                ct state established,related accept

                icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept

                tcp dport { 22, 80, 443 } accept

        }

        chain forward {
                type filter hook forward priority 0;
                policy drop;

                ct state established,related accept

                ip6 daddr 2a0a:e5c1:137:b00::/64  jump container
                ip6 daddr 2a0a:e5c1:137:cafe::/64 jump container
        }

        chain container {
                icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept

                tcp dport { 22, 80, 443 } accept
                drop

        }
        chain output {
                type filter hook output priority 0;
                policy accept;
        }
}

table ip filter {
        chain input {
                type filter hook input priority 0;
                policy drop;

                iif lo accept
                ct state established,related accept
                tcp dport { 22 } accept
                tcp dport { 51820 } accept
        }
        chain forward {
                type filter hook forward priority 0;
                policy drop;
        }
        chain output {
                type filter hook output priority 0;
                policy accept;
        }
}

The firewall explained: IPv6

Let's have a look at the IPv6 part first. In nftables we can freely define chains; what is important is the hook that we use in them.

        chain input {
                type filter hook input priority 0;
                ...

The policy has the same meaning as in iptables and basically specifies what to do with unmatched packets.

IPv6 uses quite a few ICMPv6 messages to control and also to establish communication in the first place, so the list of accepted types is quite long.

                icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept

As we are dealing with traffic that comes to my notebook ("hook input"), I want to allow any incoming packets that belong to one of the connections that I initiated:

                ct state established,related accept

And finally, I allow port 22, to be able to ssh into my notebook, port 80 to get letsencrypt certificates and port 443 for serving https. When I am online, my notebook is reachable at nico.plays.ipv6.games, so I need the web ports to be open.

As I run quite a few tests on my notebook with docker and lxc, I created a /64 IPv6 network for each of them. When matching on those specific networks, I jump into a chain that allows specific configurations for containers:

                ip6 daddr 2a0a:e5c1:137:b00::/64  jump container
                ip6 daddr 2a0a:e5c1:137:cafe::/64 jump container
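As a small illustration of what the two daddr matches select, the following Python sketch checks whether a destination address falls into one of the two container /64 networks. This is not how nftables works internally; the function name jumps_to_container is made up for this example and only the two prefixes are taken from the ruleset.

```python
import ipaddress

# The two container prefixes from the forward chain above.
CONTAINER_NETWORKS = [
    ipaddress.ip_network("2a0a:e5c1:137:b00::/64"),
    ipaddress.ip_network("2a0a:e5c1:137:cafe::/64"),
]


def jumps_to_container(daddr):
    """Return True if daddr would match one of the 'jump container' rules.

    Hypothetical helper for illustration only.
    """
    addr = ipaddress.ip_address(daddr)
    return any(addr in net for net in CONTAINER_NETWORKS)
```

Any address whose first 64 bits equal one of the prefixes matches; everything else falls through to the drop policy of the forward chain.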

The chain container consists at the moment of the same rule set as the input chain, however this changes occasionally when testing applications in containers.

And for the output chain, I trust that the traffic my notebook emits is what I wanted it to emit (this also allows malware to send out data, if I had some installed).

The firewall explained: IPv4

In the IPv4 area ("table ip filter") things are quite similar with some small differences:

  • I don't provide services on IPv4 besides ssh and wireguard (port 22 and 51820)
  • There is nothing to be forwarded for IPv4, all containers use IPv6
  • Same logic for the output as in IPv6

Safe or not safe?

Whether this ruleset is safe or not depends a bit on your degree of paranoia. I allow access to port 443, on which nginx runs and proxies to a self-written flask application, which might or might not be safe.

Some people argue for limiting outgoing traffic, and while this is certainly possible (whitelist ports?), it is often rendered useless, as any command and control server can be reached on port 80 and you probably don't want to block outgoing port 80 traffic.

If you have any comments about it, I'm interested in hearing your feedback on the ungleich chat, twitter or IRC (telmich).

Update 2019-12-24

I forgot to allow loopback traffic in the original version, which breaks some local networking.

Posted Mon Dec 23 18:23:43 2019 Tags:

Introduction

Dealing with a lot of hardware (in the sense of moving/maintaining) involves some support from vendors. Sometimes vendors are doing a particularly bad job. This blog page is dedicated to vendor screwups and documents real stories.

The support for a Dell XPS 13 2-in-1 (2019-12-19 - 2020-04-06)

10 days after the repair the space bar exhibits the same behaviour and hangs / does not produce a character. I cannot open a service request on the website, as the previous request is still open. Additionally, the rubber from the screen now falls off. The overall impression of the device is that of a cheap $50 notebook that you buy on a shady electronics market, significantly below the standard of regular notebooks.

Again a list of "mysterious" calls can be seen on the Dell website, but nobody ever responds on Dell's own website. Meanwhile the support on Twitter tells me to take pictures and a video of the space bar. After I sent all of it, I am asked to reboot the system and check whether the space bar still doesn't work in the bootloader.

Updates from this particular Dell fiasco:

  • 2019-12-20: The exchange is confirmed, it should be done in 2-3 work days.
  • 2019-12-28: The necessary part (unclear which) is not available, the exchange is postponed to 2020-01-12.
  • 2019-12-31: I get an offer to have the system replaced by a refurbished system. That does not make any sense, as the device is almost brand new and a refurbished one has been returned (probably for a good reason).
  • 2020-01-03: I request a statement to how Dell understands "Next business day" support, as the problem has been open for weeks. Again.
  • 2020-01-04: Another key ("d") now also gets stuck. To me it seems the keyboard was never tested under real use.
  • 2020-01-06: Dell informs me that the repair is delayed until 2020-01-28.
  • 2020-01-07: I inform Dell that their behaviour is breaking the contract and that I want to have a refund, a replacement or a repair by end of week. This is still significantly longer than "next business day".
  • 2020-01-08: Dell re-informs me that they can exchange with a refurbished device, which I declined before.
  • 2020-01-10: Dell re-informs me that the repair is delayed until 2020-01-28
  • 2020-01-16: Dell informs me that the repair is delayed until 2020-02-24. This makes it a repair time of at least 2 months.
  • 2020-01-16: I re-inform Dell that they can refund and pick up the device and that I expect it to be done by 2020-01-24.
  • 2020-01-21: Instead of replacing the notebook with a used notebook, Dell today suggests replacing it with a new one. I accept the proposal and now wait for a replacement device.
  • 2020-01-22: I am asked to take a picture of the notebook with the serial tag and to provide the following information: 1. Service request number, 2. Registered Owner's name, 3. Current date and time and 4. Current Location. However the device does not have a sticker and all information is already present at Dell.
  • 2020-01-28: While the replacement notebook should be on the way (according to the tracking it isn't, but then again nothing in the support system of Dell is up-to-date), the current notebook is slowly dying: The screen has become wobbly and makes funny noises when opening, closing or even moving the notebook while the display is open. If this notebook was about 7 years old, I'd say it's a typical wear-and-tear problem. However it is about 3 months old now. My hope: it's only this particular model, it's not an issue of the whole XPS 13 series. My fear: it actually looks to be designed rather fragile, compared to a thinkpad. Also note: more random keys get stuck halfway and make it impossible to type text correctly, because a key may or may not function on the first hit.
  • 2020-01-31: The date for the replacement is set to be 2020-02-14.
  • 2020-01-31: The notebook begins to further fall apart: the keyboard/lower part slowly disconnects from the screen on the right side. This might also explain the wobbly behaviour. Furthermore the notebook now freezes with some disk I/O. The latter could be a software bug, the former could be a mis-repair (screw loose?). So clearly, if you need to rely on a computer, neither the XPS nor Dell is something to choose.
  • 2020-02-02: The audio jack is now loose and headphones only get partial connectivity. This is probably related to the right part of the screen falling off.
  • 2020-02-03: Can't believe it, but now the touchpad also gets stuck. It is about 0.3mm down on the left side, making it impossible to issue a left click.
  • 2020-02-11: The replacement notebook arrived.
  • 2020-02-16: The replacement notebook gets hand-burning hot at the bottom. Problem reported via Twitter.
  • 2020-02-17: Dell says the hot temperatures are normal, even though I advise Dell that it has potential to burn my skin.
  • 2020-02-19: On the replacement notebook the "c" key gets stuck from time to time.
  • 2020-02-25: The power supply of the replacement notebook is broken. It stops charging after some time, the led on the charger turns off. It works again, if it is disconnected from the power outlet for some hours.
  • 2020-02-26: Running diagnostics confirms that the charger is broken. The amount of time spent on debugging this notebook series is beyond ridiculous. Dell so far refuses a full refund even though they clearly ship unusable hardware.
  • 2020-02-26: The "h" and "u" keys are now also getting partially stuck.
  • 2020-02-26: The system gets very slow (mouse pointer lagging slow). I reboot. The system gets stuck in the Dell logo state. Turning it off hard. Turning it on again. It stays stuck with the Dell logo.
  • 2020-04-20: The system has been sent back and refunded.

Summary: Dell is fully incapable of repairing a device and upholding a contract. I assumed I bought a notebook with next business day service. What I got is a computer which has frequent hardware failures and no support within any sensible amount of time.

The support for a Dell XPS 13 2-in-1 (2019-11-16 - 2019-12-09)

I ordered this particular notebook on 2019-09-19 and it arrived around 2019-09-27. So far so good. However shortly after starting to use it, I managed to get a somewhat-stuck key (the "p key"), which more-or-less randomly hangs/does not produce a character. As some of my passwords contain a p, this led to very frustrating login failures.

Having a stuck key like this after less than 2 months of use is really not showing good quality, so I reported this issue with Dell on 2019-11-16. With the device I bought the so called "Complete Care Service" and "Premium Support". In theory reachable 24x7.

In practice, after opening the support request on 2019-11-16, I did not receive a real reply on the following Monday. So I reached out again and got a reply on Tuesday, already being late if it was only next business day (NBD) support.

After reporting that issue, the rubber strip on the bottom that keeps the notebook stable on the table additionally began to detach itself from the notebook. Only another minor problem, but clearly not something to expect from a quality device.

After a long back-and-forth via Twitter DM about the device heat and whether the p key still occasionally is stuck (yes!) there was eventually a replacement scheduled for the 26th of November.

However - you can guess it - nobody showed up. The log at Dell says that somebody tried to reach me, however there were no missed calls on any of my numbers. And no email and no direct message. So even if somebody tried to call, they did not bother sending an email.

Until I reached out again, after I got a message that the phone number is forwarded. It continues "funny" like that: on 26th there was no further communication from Dell. No message, no call, no email.

However when logging in to the Dell portal, Dell rescheduled the appointment for Thursday, 28th, 0800.

Independently of how the story evolves from here, the amount of time spent on support, waiting, replanning locations, etc. already exceeds the worth of the product. So I can clearly advise against buying this device/support combination if you want to work with it professionally.

And it continued on 2019-11-27 at around 2230 in the evening when the Dell technician called me by accident. "I just wanted to save your number". Then asking me on the phone where Glarus is in detail. I guess Dell doesn't have a navigation software... Then eventually telling me that he might or might not come tomorrow (the 28th), but he will certainly contact me in the morning.

2019-11-28, around 1400. No call, no message, no nothing. Reaching out via Twitter DM. Again. My phone number is confirmed, I get as an answer. So yet another day where Dell scheduled the support (not me), does not appear, does not reach out nor gives any suitable answer.

2019-11-29. The technician just wrote an email that he comes Monday 1200. That is yet another week after Dell originally announced the repair and yet another time that Dell unilaterally decides on a new repair date without even trying to confirm the date.

But it gets worse: later in the evening I received a Twitter message that the case is closed. Without ever having seen a technician, without having gotten it repaired. And a bit later it gets confirmed on the "service request" page of Dell.

So in a summary:

  • Waited for nothing to happen for two weeks
  • Multiple support appointments scheduled without ever showing up
  • Claims of trying to reach me by phone without any missed call (and other calls that I received the same day)
  • Wasted many hours in communication
  • No support executed at all
  • Support case closed without doing anything

The story continues: 2019-12-04. I got a message saying that the technician is coming tomorrow. Again without confirmation from my side and with less than 24h to react.

2019-12-05: because so far nobody ever showed up, I send a message via Twitter to @DellHilft, asking about the technician. Answer is that I should wait. The third day. I also check the support center, which claims to have called me at 3 am GMT. 3 am. Seriously, which company does that?

GMT is actually behind Swiss time, so the actual call happened around 4 am local time. Besides all of that, I obviously did not receive a call.

BUT things can get worse with Dell. Since the 5th, my messages on the Dell support website don't show up anymore. It basically looks as follows:

Dell: we are calling you
Me: I don't see a call, this is my number:
Dell: we are calling you
Me: Hello? Did you see my message?
Dell: We will just silently drop your messages now

Since 2019-12-05 the "." key is also stuck from time to time. Basically the notebook is falling apart within 2 months of use and the only thing you get is false claims of a technician showing up.

2019-12-06: the technician is calling around 0830. He starts by asking where I live and then tells me it is far away and he doesn't have time for me. He has many other customers. He also sounds very drunk. He tells me he might come on Monday, but cannot tell a time yet.

Also on the same day: I get a note from Dell telling me the technician could not reach me. Not sure how many WTFs can be produced within one day, but Dell is really pushing it to the limits.

2019-12-09: the technician called at 0900, arrived by 1230 and fixed the notebook around 1500.

  • Roughly 4 weeks waiting time
  • Roughly 80+ messages exchanged with Dell
  • 4 working days invested to get it fixed
Posted Tue Nov 26 17:41:26 2019 Tags:

Overview

Due to RAM limitations in most notebooks (16G maximum) I have recently switched to the HP X360 1040 G5, more or less the 14" HP equivalent of the Lenovo X1 Carbon. Some tech specs for the geeks among us:

  • Resolution 3840x2160
  • 1 TB SSD / NVMe
  • 32GB RAM

This article is work in progress, currently more to be seen as a todo list for myself.

Alpine

My backup notebooks are currently running Arch Linux and Devuan. As I find Alpine an interesting project (it comes close to how I think Linux should be), I thought about giving it a try.

Some things that are a bit special in Alpine Linux:

  • Does not come with shadow by default
  • Uses musl libc instead of glibc (yeah!)

Besides that, some things that are instant benefits of Alpine:

  • easy to use package manager
  • easy to write package format
  • VERY fast package installations (because they are fast)
  • The sound is GREAT (especially compared to the X1 Carbon that does not really have speakers)

What is working on alpine + X360 1040

Almost everything. C'mon, it's 2019 and as long as xorg + i3 is running, what more do you want? Some things worth emphasising:

  • The keyboard is quite nice (actually nicer than Gen6 X1 Carbon)
  • You can run startx via ssh and there is no stupid config that stops you from it!
  • Suspend works even with playing sound, just using pm-suspend + acpid
  • beauty!

What is currently not working on alpine + X360 1040

There are a few minor hiccups that I still need to solve in the next days:

  • create a package for mu4e 1.2 (currently installed in /usr/local)
    • needs fix for /usr/bin/sh reference
    • PR created by eu at https://github.com/alpinelinux/aports/pull/7881/files
    • local install: works!
  • -create a package for magit-
    • M-x package-install magit
  • create a package for vym
  • create a package for openconnect
  • create a package for kismet
  • check out why the shotwell package is broken
  • check out why the firefox package is broken
  • hotkeys don't send the right key events => might be a kernel issue
  • xrandr does not show screen connected via usb-c (have to test other outputs)
  • automate lid handling in cdist
    • Currently just created /etc/acpi/LID/0000080 with pm-suspend in it => works
  • The device has a very high frequency sound that varies over time
    • Seems to be unrelated to power plugged in or out
    • Seems to be related to the fan: fan on => no audible high frequency sound
    • The sound is louder than music played at "regular" volume
    • The sound is directly related to screen brightness: 100% => no sound
    • The lower the brightness, the stronger the sound

What has been fixed

  • xbacklight
    • Need to load / install the intel video driver (modesetting does not work atm)
Posted Tue May 14 19:10:49 2019 Tags:

Here's a short overview about the changes found in version 4.11.1:

* Core: Improve explorer error reporting (Darko Poljak)
* Type __directory: explorer stat: add support for Solaris (Ander Punnar)
* Type __file: explorer stat: add support for Solaris (Ander Punnar)
* Type __ssh_authorized_keys: Remove legacy code (Ander Punnar)
* Explorer disks: Bugfix: do not break config in case of unsupported OS
  which was introduced in 4.11.0, print message to stderr and empty disk list
  to stdout instead (Darko Poljak)

For more information visit the cdist homepage.

Posted Mon Apr 22 21:14:37 2019 Tags:

Here's a short overview about the changes found in version 4.11.0:

* Type __package: Add __package_apk support (Nico Schottelius)
* Type __directory: Add alpine support (Nico Schottelius)
* Type __file: Add alpine support (Nico Schottelius)
* Type __hostname: Add alpine support (Nico Schottelius)
* Type __locale: Add alpine support (Nico Schottelius)
* Type __start_on_boot: Add alpine support (Nico Schottelius)
* Type __timezone: Add alpine support (Nico Schottelius)
* Type __start_on_boot: gentoo: check all runlevels in explorer (Nico Schottelius)
* New type: __package_apk (Nico Schottelius)
* Type __acl: Add support for ACL mask (Dimitrios Apostolou)
* Core: Fix circular dependency for CDIST_ORDER_DEPENDENCY (Darko Poljak)
* Type __acl: Improve the type (Ander Punnar)
* Explorer interfaces: Simplify code, be more compatible (Ander Punnar)
* Explorer disks: Remove assumable default/fallback, for now explicitly support only Linux and BSDs (Ander Punnar, Darko Poljak)

For more information visit the cdist homepage.

Posted Sat Apr 20 17:16:55 2019 Tags:

Here's a short overview about the changes found in version 4.10.11:

* Core: Fix broken quiet mode (Darko Poljak)
* Build: Add version.py into generated raw source archive (Darko Poljak)
* Explorer disks: Fix detecting disks, fix/add support for BSDs (Ander Punnar)
* Type __file: Fix stat explorer for BSDs (Ander Punnar)
* Type __directory: Fix stat explorer for BSDs (Ander Punnar)

For more information visit the cdist homepage.

Posted Sat Apr 13 19:57:39 2019 Tags:

Here's a short overview about the changes found in version 4.10.10:

* New types: __ufw and __ufw_rule (Mark Polyakov)
* Type __link: Add messaging (Ander Punnar)
* Debugging: Rename debug-dump.sh to cdist-dump (Darko Poljak)
* Documentation: Add cdist-dump man page (Darko Poljak)

For more information visit the cdist homepage.

Posted Thu Apr 11 14:49:55 2019 Tags:

Here's a short overview about the changes found in version 4.10.9:

* Type __ssh_authorized_keys: Properly handle multiple --option params (Steven Armstrong)
* Debugging: Add debug dump helper script (Darko Poljak)
* Type __file: Bugfix: fire onchange for present and exists states if no attribute is changed (Darko Poljak)

For more information visit the cdist homepage.

Posted Tue Apr 9 22:49:38 2019 Tags:

Here's a short overview about the changes found in version 4.10.8:

* Type __clean_path: Fix list explorer exit code if path not directory or does not exist (Ander Punnar)
* New type: __check_messages (Ander Punnar)

For more information visit the cdist homepage.

Posted Sat Apr 6 10:55:05 2019 Tags: