Task 9

2024-12-20

1 Setup

1.1 Libraries

library(httr)
library(xml2)
library(magrittr)
library(dplyr)
library(purrr)
library(stringr)
library(knitr)
library(cli)
library(bit64)

1.2 Retrieve Data from `AoC`

session_cookie <- set_cookies(session = keyring::key_get("AoC-GitHub-Cookie"))
base_url <- paste0("https://adventofcode.com/2024/day/", params$task_nr)
puzzle <- GET(base_url,
              session_cookie) %>% 
  content(encoding = "UTF-8") %>% 
  xml_find_all("//article") %>% 
  lapply(as.character)

parse_puzzle_data <- function(text_block = readClipboard()) {
  text_block %>% 
    unlist() %>% 
    str_remove_all("\n") %>% 
    str_split("") %>% 
    extract2(1L) %>% 
    as.integer()
}
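
To illustrate what the parser is meant to return, the following base R sketch reproduces its steps on a literal string instead of the clipboard or the HTTP response. The helper name `parse_disk_map` is hypothetical and not part of the original code.

```r
## Hypothetical base R equivalent of parse_puzzle_data(), for illustration only
parse_disk_map <- function(text_block) {
  ## drop the trailing newline, then split the line into single digits
  digits <- gsub("\n", "", text_block, fixed = TRUE)
  as.integer(strsplit(digits, "")[[1L]])
}

parse_disk_map("12345\n")
## [1] 1 2 3 4 5
```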

puzzle_data <- local({
  GET(paste0(base_url, "/input"),
      session_cookie) %>% 
    content(encoding = "UTF-8") %>% 
    parse_puzzle_data()
})

2 Puzzle Day 9

2.1 Part 1

2.1.1 Description

— Day 9: Disk Fragmenter —

Another push of the button leaves you in the familiar hallways of some friendly amphipods! Good thing you each somehow got your own personal mini submarine. The Historians jet away in search of the Chief, mostly by driving directly into walls.

While The Historians quickly figure out how to pilot these things, you notice an amphipod in the corner struggling with his computer. He’s trying to make more contiguous free space by compacting all of the files, but his program isn’t working; you offer to help.

He shows you the disk map (your puzzle input) he’s already generated. For example:

2333133121414131402

The disk map uses a dense format to represent the layout of files and free space on the disk. The digits alternate between indicating the length of a file and the length of free space.

So, a disk map like 12345 would represent a one-block file, two blocks of free space, a three-block file, four blocks of free space, and then a five-block file. A disk map like 90909 would represent three nine-block files in a row (with no free space between them).

Each file on disk also has an ID number based on the order of the files as they appear before they are rearranged, starting with ID 0. So, the disk map 12345 has three files: a one-block file with ID 0, a three-block file with ID 1, and a five-block file with ID 2. Using one character for each block where digits are the file ID and . is free space, the disk map 12345 represents these individual blocks:

0..111....22222

The first example above, 2333133121414131402, represents these individual blocks:

00...111...2...333.44.5555.6666.777.888899
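
The expansion from the dense format to individual blocks can be sketched in a few lines of base R. The function name `expand_disk_map` is an assumption made for this illustration, and the string output only works while file IDs are single digits, as in the two examples.

```r
## Hypothetical helper: turn a dense disk map into the block string shown above
expand_disk_map <- function(disk_map) {
  digits <- as.integer(strsplit(disk_map, "")[[1L]])
  ## odd positions describe files, even positions describe free space
  is_file <- seq_along(digits) %% 2 == 1
  ## file IDs count the files from 0 in order of appearance
  symbols <- ifelse(is_file,
                    as.character((seq_along(digits) - 1) %/% 2),
                    ".")
  paste(rep(symbols, times = digits), collapse = "")
}

expand_disk_map("12345")
## [1] "0..111....22222"
expand_disk_map("2333133121414131402")
## [1] "00...111...2...333.44.5555.6666.777.888899"
```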

The amphipod would like to move file blocks one at a time from the end of the disk to the leftmost free space block (until there are no gaps remaining between file blocks). For the disk map 12345, the process looks like this:

0..111....22222
02.111....2222.
022111....222..
0221112...22...
02211122..2....
022111222......

The first example requires a few more steps:

00...111...2...333.44.5555.6666.777.888899
009..111...2...333.44.5555.6666.777.88889.
0099.111...2...333.44.5555.6666.777.8888..
00998111...2...333.44.5555.6666.777.888...
009981118..2...333.44.5555.6666.777.88....
0099811188.2...333.44.5555.6666.777.8.....
009981118882...333.44.5555.6666.777.......
0099811188827..333.44.5555.6666.77........
00998111888277.333.44.5555.6666.7.........
009981118882777333.44.5555.6666...........
009981118882777333644.5555.666............
00998111888277733364465555.66.............
0099811188827773336446555566..............

The final step of this file-compacting process is to update the filesystem checksum. To calculate the checksum, add up the result of multiplying each of these blocks’ position with the file ID number it contains. The leftmost block is in position 0. If a block contains free space, skip it instead.

Continuing the first example, the first few blocks’ position multiplied by its file ID number are 0 * 0 = 0, 1 * 0 = 0, 2 * 9 = 18, 3 * 9 = 27, 4 * 8 = 32, and so on. In this example, the checksum is the sum of these, 1928.
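
The checksum rule can be checked directly against the compacted example. The sketch below uses a hypothetical helper, `checksum_from_blocks`, that reads a block string with single-digit file IDs and skips free blocks marked with a dot.

```r
## Hypothetical helper: checksum of a block string with single-digit file IDs
checksum_from_blocks <- function(blocks) {
  chars <- strsplit(blocks, "")[[1L]]
  is_file <- chars != "."
  ## positions are 0-based, so the leftmost block has position 0
  positions <- seq_along(chars) - 1
  sum(positions[is_file] * as.integer(chars[is_file]))
}

checksum_from_blocks("0099811188827773336446555566..............")
## [1] 1928
```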

Compact the amphipod’s hard drive using the process he requested. What is the resulting filesystem checksum? (Be careful copy/pasting the input for this puzzle; it is a single, very long line.)

2.1.2 Solution

The idea is as follows:

After compaction, the disk consists of one contiguous run of file blocks followed only by free space. The length of that run equals the total number of file blocks, N.
First, we compute the position of every file block in the original disk layout.
Then, we compare each block's original position (counted from 0) with N:
If the position is smaller than N, the block keeps its index.
Otherwise, the block moves to one of the free positions below N. Blocks are taken from the end of the disk first, so the rightmost block receives the smallest free index.
Finally, we multiply each block's final position by its file ID and sum the products.
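
The steps above can be traced on the small disk map 12345 with a base R sketch. All variable names here are chosen for illustration and do not appear in the actual solution.

```r
disk_map <- c(1L, 2L, 3L, 4L, 5L)

## block-level view: file ID for file blocks, NA for free blocks
is_file <- seq_along(disk_map) %% 2 == 1
block_ids <- rep(ifelse(is_file, (seq_along(disk_map) - 1L) %/% 2L, NA_integer_),
                 times = disk_map)

## 0-based original positions and IDs of all file blocks
positions <- seq_along(block_ids) - 1L
file_pos <- positions[!is.na(block_ids)]
file_id <- block_ids[!is.na(block_ids)]

## N: length of the compacted run of file blocks
n_blocks <- length(file_pos)

## blocks left of N stay; the others fill the free positions below N,
## with the rightmost block taking the smallest free index
keeps <- file_pos < n_blocks
free_below <- setdiff(seq_len(n_blocks) - 1L, file_pos)
final_pos <- file_pos
final_pos[!keeps] <- rev(free_below)

sum(final_pos * file_id)
## [1] 60
```

The final positions correspond to the compacted layout 022111222 from the puzzle description.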

get_layout <- function(disk_map) {
  if (length(disk_map) %% 2 == 0) {
    fill_me <- NULL
  } else {
    fill_me <- NA_integer_
  }
  tibble(
    files = disk_map[seq(1L, length(disk_map), 2L)],
    free_space = c(disk_map[seq(2L, length(disk_map), 2L)], fill_me)
  ) %>% 
    mutate(file_id = 0:(n() - 1L), .before = 1L) %>% 
    mutate(start = lag(cumsum(files + free_space), default = 0L),
           indices = map2(start, files, ~ seq(.x, length.out = .y)))
}
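
To make the `start` and `indices` columns concrete, here is a minimal base R sketch of the same arithmetic for the disk map 12345. It is an illustration under the assumption that the last file has no trailing free space (hence the `NA`), and it does not use the tibble-based code.

```r
files <- c(1L, 3L, 5L)
free_space <- c(2L, 4L, NA)

## each file starts where the previous file plus its trailing gap ends
start <- c(0L, head(cumsum(files + free_space), -1L))
start
## [1]  0  3 10

## 0-based block positions occupied by each file
indices <- Map(function(s, n) seq(s, length.out = n), start, files)
unlist(indices)
## [1]  0  3  4  5 10 11 12 13 14
```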

checksum <- function(disk_map) {
  layout <- get_layout(disk_map) 
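  ## one row per file block: file id, 0-based original position, and whether
  ## the block already lies inside the final compacted region
  indices <- layout %>% 
    reframe(file_id = rep(file_id, lengths(indices)),
            original_index = unlist(indices),
            has_space = original_index < sum(lengths(indices)))
  ## free positions, smallest first; keep as many as there are blocks to move
  ## and reverse them so the rightmost block receives the smallest index
  empty_indices <- seq(0, max(indices %>% pull(original_index))) %>% 
    setdiff(indices %>% pull(original_index)) %>% 
    extract(seq_len(sum(!(indices %>% pull(has_space))))) %>% 
    rev()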
  indices %>% 
    mutate(final_index = c(original_index[has_space],
                           empty_indices)) %>% 
    summarize(chk_sum = sum(as.integer64(file_id) *
                              as.integer64(final_index))) %>% 
    pull(chk_sum)
}

checksum(puzzle_data)

## integer64
## [1] 6398608069280
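
A checksum of this size does not fit into R's 32-bit integer type, which is presumably why the products are computed as `integer64`. The following base R sketch shows the limit; the variable names are illustrative only.

```r
## the largest value a base R integer can hold
int_max <- .Machine$integer.max
int_max
## [1] 2147483647

## summing past that limit yields NA (with a warning) instead of a number
overflowed <- suppressWarnings(sum(c(int_max, 1L)))
overflowed
## [1] NA

## doubles represent whole numbers exactly up to 2^53, so they would also cope
as.numeric(int_max) + 1
## [1] 2147483648
```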

2.2 Part 2

2.2.1 Description

— Part Two —

Upon completion, two things immediately become clear. First, the disk definitely has a lot more contiguous free space, just like the amphipod hoped. Second, the computer is running much more slowly! Maybe introducing all of that file system fragmentation was a bad idea?

The eager amphipod already has a new plan: rather than move individual blocks, he’d like to try compacting the files on his disk by moving whole files instead.

This time, attempt to move whole files to the leftmost span of free space blocks that could fit the file. Attempt to move each file exactly once in order of decreasing file ID number starting with the file with the highest file ID number. If there is no span of free space to the left of a file that is large enough to fit the file, the file does not move.

The first example from above now proceeds differently:

00...111...2...333.44.5555.6666.777.888899
0099.111...2...333.44.5555.6666.777.8888..
0099.1117772...333.44.5555.6666.....8888..
0099.111777244.333....5555.6666.....8888..
00992111777.44.333....5555.6666.....8888..

The process of updating the filesystem checksum is the same; now, this example’s checksum would be 2858.

Start over, now compacting the amphipod’s hard drive using this new method instead. What is the resulting filesystem checksum?

2.2.2 Solution

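This time we process the files one by one, from the highest file ID down to the lowest. For each file we look for the leftmost span of free space to its left that is large enough to hold the whole file. If such a span exists, the file moves there and the span shrinks accordingly; otherwise the file stays where it is.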
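
The move rule can be illustrated on the puzzle's example with a stripped-down base R sketch. Unlike the tibble-based implementation, it tracks only the start and the remaining length of each gap instead of explicit index lists; this is a simplification chosen for the illustration, and all names are hypothetical.

```r
disk_map <- as.integer(strsplit("2333133121414131402", "")[[1L]])
file_pos <- seq(1L, length(disk_map), 2L)
free_pos <- seq(2L, length(disk_map), 2L)

## 0-based start of every segment (files and gaps alike)
starts <- cumsum(disk_map) - disk_map
files <- disk_map[file_pos]
file_start <- starts[file_pos]
free_len <- disk_map[free_pos]
free_start <- starts[free_pos]

for (i in rev(seq_along(files))) {
  ## gaps 1..(i - 1) are exactly the gaps to the left of file i
  candidates <- which(free_len[seq_len(i - 1L)] >= files[i])
  if (length(candidates) > 0L) {
    slot <- candidates[1L]
    ## move the file to the front of the gap and shrink the gap
    file_start[i] <- free_start[slot]
    free_start[slot] <- free_start[slot] + files[i]
    free_len[slot] <- free_len[slot] - files[i]
  }
}

## sum of position * file ID over all blocks of all files
per_file <- Map(function(id, s, len) id * (s + seq_len(len) - 1),
                seq_along(files) - 1, file_start, files)
sum(unlist(per_file))
## [1] 2858
```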

checksum_defragmented <- function(disk_map) {
  layout <- get_layout(disk_map) %>% 
    mutate(free_indices = map2(start + files, 
                               free_space,
                               ~ seq(.x, length.out = coalesce(.y, 0L))),
           final_indices = rep(list(integer64(0L)), n()))
  for (row_index in seq(nrow(layout), 1L)) {
    record <- layout %>% 
      slice(row_index)
    size_needed <- record %>% 
      pull(files)
    size_available <- layout %>% 
      pull(free_space)
    slot_indices <- size_available >= size_needed & 
      seq_along(size_available) < row_index
    if (any(slot_indices)) {
      slot_index <- which.max(slot_indices)
      slot <- layout %>% 
        slice(slot_index)
      avail_indices <- slot %>% 
        pull(free_indices) %>% 
        unlist()
      used_indices <- head(avail_indices, size_needed)
      remaining_indices <- setdiff(avail_indices, used_indices)
      ## store the found indices for the current record
      layout[row_index, "final_indices"] <- list(list(as.integer64(used_indices)))
      ## remove the used indices from the found slot
      layout[slot_index, "free_indices"] <- list(list(remaining_indices))
      ## update the free_space column
      layout[slot_index, "free_space"] <- length(remaining_indices)
    } else {
      layout[row_index, "final_indices"] <- map(layout[row_index, "indices"], 
                                                ~ map(.x, as.integer64))
    }
  }
  ## combine the list with do.call(c, .): unlist() would drop the integer64 class
  layout %>%
    summarize(chk_sum = sum(rep(file_id, files) * do.call(c, final_indices))) %>%
    pull(chk_sum)
}
checksum_defragmented(puzzle_data)

## integer64
## [1] 6427437134372
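
The reason for combining the list column with `do.call(c, .)` rather than `unlist()` can be seen in a small sketch. It assumes the `bit64` package is installed; the variable names are illustrative only.

```r
library(bit64)

values <- list(as.integer64(1L), as.integer64(2L))

## unlist() strips the class attribute, so the result is no longer integer64
flattened <- unlist(values)
inherits(flattened, "integer64")
## [1] FALSE

## c() dispatches to the integer64 method and keeps the class
combined <- do.call(c, values)
inherits(combined, "integer64")
## [1] TRUE
```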