
Crazyfrank's Blog

Understanding Decentralization Through Log Processing

This article explores decentralized system design concepts and implementation approaches through the lens of log processing.

Overview of Decentralized Systems

Decentralization is a key concept in distributed system design that improves system reliability and fault tolerance by distributing control across multiple nodes.

In architecture, “decentralization” doesn’t necessarily mean “no center at all,” but rather:

  • Avoid irreplaceable single points of failure by using distributed, peer-to-peer nodes to share responsibilities
  • System availability and scalability don’t depend on any single critical node

Log Processing

Taking the ELK log processing architecture as an example:
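As one possible illustration (a sketch of my own, not taken from the article), a Go service might emit structured JSON logs to stdout and rely on a collector such as Filebeat or Logstash to ship them into Elasticsearch:

package main

import (
	"log/slog"
	"os"
)

func main() {
	// JSON logs on stdout are easy for a log shipper (e.g. Filebeat or
	// Logstash) to collect and forward into Elasticsearch for Kibana to query.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	logger.Info("order created",
		"service", "order-service", // hypothetical service name
		"order_id", 12345,
		"latency_ms", 37,
	)
}

Each instance only logs locally; collection, indexing, and querying are handled by separate, horizontally scalable components, which fits the decentralization idea above.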

Some Thoughts on DDD

Project Introduction

I’m writing this article mainly to share some insights from putting DDD into practice in a project. Since I’m too lazy to read the theory, I’m mostly guessing at how DDD is supposed to be designed. For more details, please refer to the DDD official website: DDD Concept Introduction

Recently, ByteDance open-sourced their open-coze project. The front end is React + TS, and the back end is Go. Since I don’t really work with Python or TS, and I know Go a bit better, I started learning from the Go side.

Logs and Errors

Generally speaking, if we strictly divide our project into layers, they are roughly handler, service (which in DDD is further split into an application layer), domain, and dao.

For the dao layer, when an error occurs we simply propagate it upward, logging it only selectively. For example, some interfaces involve parameter validation at the dao layer, manually managed transactions, and so on, which are worth logging there, but the error is still propagated upward.
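As a rough sketch of that pattern (the package, types, and names here are made up for illustration, not taken from any real project), a dao method might wrap the error with context and return it, logging only in the special cases mentioned above:

package dao

import (
	"context"
	"database/sql"
	"fmt"
	"log"
)

type User struct {
	ID   int64
	Name string
}

type UserDao struct {
	db *sql.DB
}

// GetUserByID illustrates “propagate up, log selectively”.
func (d *UserDao) GetUserByID(ctx context.Context, id int64) (*User, error) {
	if id <= 0 {
		// Parameter validation at the dao layer: worth logging here,
		// but the error is still returned to the caller.
		log.Printf("dao: invalid user id %d", id)
		return nil, fmt.Errorf("get user %d: invalid id", id)
	}

	var u User
	err := d.db.QueryRowContext(ctx,
		"SELECT id, name FROM users WHERE id = ?", id,
	).Scan(&u.ID, &u.Name)
	if err != nil {
		// No logging here: wrap with context and let the upper layers
		// (service/application, handler) decide whether and how to log.
		return nil, fmt.Errorf("get user %d: %w", id, err)
	}
	return &u, nil
}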

DB-Cache Consistency Problem

How to keep the cache and the database consistent is a topic that has been discussed over and over again.

But many people still have a lot of doubts about this issue:

  • Should the cache be updated or deleted?
  • Should I choose to update the database first and then delete the cache, or to delete the cache first and then update the database?
  • Why introduce message queues to ensure consistency?
  • What problems can delayed double deletion cause? Should we use it or not?

Introducing a cache improves performance

Let’s start with the simplest scenario.
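As a minimal sketch of my own for that simplest scenario, cache-aside reads put the cache in front of the database (the Cache and DB interfaces below are invented for illustration); the consistency questions listed above only show up once writes enter the picture:

package cacheaside

import (
	"context"
	"errors"
	"time"
)

// ErrCacheMiss signals that the key is simply not cached.
var ErrCacheMiss = errors.New("cache miss")

// Cache and DB are hypothetical interfaces, just enough for the sketch.
type Cache interface {
	Get(ctx context.Context, key string) (string, error) // returns ErrCacheMiss when absent
	Set(ctx context.Context, key, value string, ttl time.Duration) error
}

type DB interface {
	Load(ctx context.Context, key string) (string, error)
}

// GetValue is the classic cache-aside read path: try the cache first,
// fall back to the database, then backfill the cache.
func GetValue(ctx context.Context, c Cache, db DB, key string) (string, error) {
	v, err := c.Get(ctx, key)
	if err == nil {
		return v, nil // cache hit: the database is never touched
	}
	if !errors.Is(err, ErrCacheMiss) {
		return "", err // real cache failure, surface it
	}

	v, err = db.Load(ctx, key)
	if err != nil {
		return "", err
	}

	// Backfill with a TTL so stale entries eventually expire even if a
	// later write forgets to invalidate them.
	_ = c.Set(ctx, key, v, 10*time.Minute)
	return v, nil
}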

Load Balancing

Overview

In a distributed environment, each microservice has multiple instances. Service registration and service discovery answer the question “what instances are available?”; the question that remains is “with so many available instances, which one do I send the request to?” Intuitively, most people who have heard some of the specialized terminology will immediately think of “load balancing”. So what exactly is load balancing?
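As a toy illustration (my own, not from the article), client-side load balancing can be as small as a round-robin picker over the instance list returned by service discovery:

package lb

import (
	"errors"
	"sync/atomic"
)

// RoundRobin hands out instances in turn; the instance list would normally
// come from service discovery (the addresses are placeholders).
type RoundRobin struct {
	instances []string
	next      atomic.Uint64
}

func NewRoundRobin(instances []string) *RoundRobin {
	return &RoundRobin{instances: instances}
}

// Pick returns the next instance in rotation.
func (r *RoundRobin) Pick() (string, error) {
	if len(r.instances) == 0 {
		return "", errors.New("no available instances")
	}
	n := r.next.Add(1) - 1
	return r.instances[n%uint64(len(r.instances))], nil
}

Other strategies (weighted round robin, least connections, consistent hashing, …) only change the Pick logic; the surrounding plumbing stays the same.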

MVCC and MySQL Logs

Under the MVCC mechanism, the redo log and binlog mainly come into play at transaction commit time; their roles and trigger timing are as follows:


Logging behavior during transaction execution

When executing:

UPDATE users SET age = 26 WHERE id = 1;

MySQL’s transaction execution order (combined with MVCC + logging) is as follows:
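As a sketch of my own (not necessarily the article’s exact breakdown), the snippet below runs the same UPDATE in a transaction via database/sql with the go-sql-driver/mysql driver (the DSN and table are placeholders), with comments summarizing the commonly described InnoDB ordering of undo log, redo log, and binlog around commit:

package main

import (
	"context"
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // registers the "mysql" driver
)

func main() {
	ctx := context.Background()

	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/test") // placeholder DSN
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}

	// During execution, InnoDB records the old row version in the undo log
	// (which also backs MVCC snapshots), modifies the page in the buffer
	// pool, and appends the change to the redo log buffer.
	if _, err := tx.ExecContext(ctx, "UPDATE users SET age = 26 WHERE id = 1"); err != nil {
		tx.Rollback()
		log.Fatal(err)
	}

	// At commit, the redo log is written in a prepare phase, the binlog is
	// written, and the redo log entry is then marked committed; this
	// two-phase commit keeps the redo log and binlog consistent.
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}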