The Solar Oracle Walkman

From SGMK-SSAM-WIKI
Revision as of 13:52, 4 September 2025

Solar Oracle Walkman (Chinese version)

Abstract

The Solar Oracle Walkman is an exploration of energy trading and sound sculpture. Its exterior references the retro Sony WM-F107, while the device measures the I–V curve of a 6 × 6 cm handmade, artistically patterned DSSC whose TiO₂ porous layer is produced by cyanotype or screen printing. Each “solar mini-disc” yields a unique I–V voiceprint that is sent—via an oracle, a mechanism for securely bridging off-chain data to the blockchain—to a smart contract for verification. Conceptually, the Walkman operates like a cold wallet: each DSSC is a physical token, and the built-in I–V tester is its reader. Upon verification, the Walkman plays generative, semantically constrained music; the on-chain verdict gates playback. In the current Max/MSP prototype, the measured I–V curve is decomposed into seven dimensionless features [FF, Vmpp/Voc, Impp/Isc, Rs*, Rsh*, Σκ, A*], optionally reduced with PCA, then manually mapped to the latent inlets of an nn~ RAVE decoder, achieving reproducible sonic identity without an explicit semantic structure. Next, we will record continuous I–V data under varied illumination and train a RAVE encoder to learn compact, robust latent embeddings of each cell; these embeddings will feed a fuzzy-extractor pipeline (quantization → error correction with helper data → hash) to derive a stable key. On-chain we anchor only a commitment to that key, preserving privacy while enabling verification. With appropriate vector-space preservation, distances in latent space will reflect differences in photovoltaic behavior, allowing the device to act as a “divinatory machine” that links matter, perception, and imagination.

Experiments

The Solar Oracle Walkman consists of three main components: an I–V curve tester, a patterned solar mini-disc, and a smart contract. The I–V curve of each solar mini-disc is measured and uploaded to a smart contract deployed on the Sepolia testnet for verification; once its I–V data passes verification, the corresponding music is generated and allowed to play from the Walkman. The sound of each "solar mini-disc" is expected to be reproducible, generative, and semantic: a passage of generative music with a clear mechanism rather than complete randomness. To make each solar mini-disc a generative device, I first assumed I would need to design a hash operation to obtain a "voiceprint (V)" for each solar glass. A hash operation is the process of feeding input data (numbers, text, files, or a set of I–V curve parameters) into a mathematical function or algorithm to produce a hash value. Hash algorithms can take input of any length but always generate a fixed-length output. They are designed to be fast to compute, to yield the same output for the same input, and to produce drastically different outputs when the input changes even slightly.
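These hash properties can be demonstrated in a few lines of Python. This is a sketch using SHA-256 from the standard library as a stand-in; the contract described later uses keccak256, which is a different function with different digests.

```python
import hashlib

def hash_hex(data: bytes) -> str:
    # Fixed-length digest regardless of input size (SHA-256 here as a
    # stand-in for keccak256, which Python's stdlib does not provide).
    return hashlib.sha256(data).hexdigest()

# Same input -> same output; any input length -> 64 hex characters.
a = hash_hex(b"FF=0.62,Vmpp/Voc=0.78")
b = hash_hex(b"FF=0.62,Vmpp/Voc=0.78")
# A one-character change in the input yields a drastically different digest.
c = hash_hex(b"FF=0.63,Vmpp/Voc=0.78")
```

The feature values in the byte strings are illustrative, not measured data.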

 '''Solar Oracle Walkman v1 — Overview'''
 
 [Light]
    ▼
 [Patterned DSSC “Mini-Disc”]
    ▼
 [I–V Scanning / ESP32-S3 Tester]
    ▼
 [Feature Extraction, 7D]
 F = [FF, Vmpp/Voc, Impp/Isc, Rs*, Rsh*, Σκ, A*]                    
    ▼     
 ml.scale normalization            
    ▼
 ml.principle (PCA)                 
    ▼
 input RAVE nn~ decoder   
    ▼
 Real-time audio output
 Cross-modal mapping: I–V latent → audio latent
 (mapping chosen by artistic / compositional context)      

 '''Solar Oracle Walkman v2 — Overview'''
 
[Light]  
    ▼
[Patterned DSSC “Mini-Disc”]
    ▼
[Continuous I–V Scanning / ESP32-S3 Tester]
    ▼
[I–V Encoder (Conv1D trained in Colab)]
Input: 7-D sequence [FF, Vmpp/Voc, Impp/Isc, Rs*, Rsh*, curvature_sum, area]
Output: latent vector z = (z1, z2, …, zn)
Training: triplet loss + prior matching
Augmentation: gain scaling / noise / temporal jitter
    ▼
 [Fuzzy Extractor]
z → normalization / quantization
  → ECC + helper data → stable key K
  → commit = keccak256(K || salt)
Enrollment: panel_id = keccak256("panel-id" || K)
Verification: new K′ → keccak256(K′ || salt) → match on-chain
    │
    ├───────────────► [Oracle / On-chain Path]
    │                   Package {pubkey, panel_id, commit}
    │                   POST to Oracle API
    │                   Smart contract verifies & records
    │                   Feedback: OK / FAIL
    │
    ▼
input RAVE nn~ decoder
    ▼
Real-time audio output
Cross-modal mapping: I–V latent → audio latent
(mapping chosen by artistic / compositional context)
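The triplet-loss term named in the encoder stage above can be sketched numerically. The margin of 0.2 and the toy 3-D latent vectors are illustrative assumptions, not the values used in the Colab training.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on latent vectors: pull measurements of
    the same cell together, push different cells apart by >= margin."""
    d_pos = np.linalg.norm(anchor - positive)   # same-cell distance
    d_neg = np.linalg.norm(anchor - negative)   # different-cell distance
    return max(0.0, d_pos - d_neg + margin)

# Two measurements of one cell (anchor/positive) vs. another cell (negative).
z_a = np.array([0.10, 0.52, 0.33])
z_p = np.array([0.11, 0.50, 0.35])   # same cell, slight measurement noise
z_n = np.array([0.80, 0.05, 0.90])   # a different cell
loss = triplet_loss(z_a, z_p, z_n)   # already well separated -> zero loss
```

In training, this loss is minimized over many such triplets so that repeated measurements of one cell cluster in latent space.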

The first prototype: the 7-D voiceprint and fuzzy extraction of the I–V curve

A DIY I–V curve tester is connected to a computer, and 16-point I–V curve measurements are sent to Max/MSP over serial communication. The I–V curve is commonly used to analyze the characteristics of a solar cell, so it serves as an ideal "voiceprint" of the panel, especially for DSSCs with cyanotyped and screen-printed TiO₂ layers. In this research, the shape of the I–V curve is deconstructed into seven features that are commonly used to measure different characteristics of a panel, and machine learning is then applied to each feature so that the shape can be learned by the computer. This method is expected to ensure irradiance invariance, so the audio output of a solar mini-disc remains reproducible even under different light exposure. The voiceprint V consists of seven features of the I–V curve: V = [FF (fill factor), Vmpp/Voc, Impp/Isc, Rs (series resistance), Rsh (shunt resistance), sum of curvature, total area under the I–V curve]. Note that the calculations here are dimensionless. A dimensionless feature vector is a set of numerical descriptors that have been normalized so they no longer carry physical units such as volts, amperes, or ohms. By converting raw measurements into dimensionless quantities (for example, ratios such as Vmpp/Voc or Impp/Isc), the features capture only the relative shape or behavior of the data, independent of its absolute scale. This is crucial when comparing or classifying I–V curves under varying light intensities, as it ensures that differences in the vector reflect intrinsic device characteristics rather than changes in measurement conditions. The feature definitions (scale-free) are listed below:

The 7-D voiceprint is defined as V = [FF, Vmpp/Voc, Impp/Isc, Rs*, Rsh*, Σκ, A*]. All features are computed on a 64-point resampled I–V trace and normalized by Voc and Isc to be invariant to irradiance and device size.

FF (fill factor)
FF = (Vmpp * Impp) / (Voc * Isc)
Vmpp/Voc and Impp/Isc
Scale-free ratios capturing the operating point at maximum power.
Rs* and Rsh* (dimensionless ohmic estimates)
First estimate the local slopes on the resampled curve:
Rs ≈ -ΔV/ΔI (evaluated near V ≈ Voc, where series resistance dominates the slope)
Rsh ≈ -ΔV/ΔI (evaluated near I ≈ Isc, where shunt leakage dominates the slope)
Then report dimensionless forms:
Rs* = Rs * (Isc / Voc)
Rsh* = Rsh * (Isc / Voc)
Σκ (curvature_sum)
Sum of absolute turning angles along the 64-point polyline of the I–V trace: for each consecutive pair of segments s_i = (ΔV_i, ΔI_i), accumulate
|angle(s_i, s_{i+1})|, and report Σκ = Σ |angle(s_i, s_{i+1})|.
(Intuition: larger Σκ indicates a more “bent” I–V shape.)
A* (normalized area under the I–V curve)
Definition: area from V=0 to V=Voc divided by (Isc * Voc).
Discrete approximation on the resampled trace:
A* ≈ (Σ I[i] * ΔV[i]) / (Isc * Voc)
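The seven definitions above can be collected into one function. This is a minimal sketch assuming a 64-point trace swept from V = 0 (I ≈ Isc) to V = Voc (I ≈ 0), with single end-segment slopes standing in for proper windowed fits:

```python
import numpy as np

def voiceprint(V, I):
    """Compute the 7-D dimensionless voiceprint from a resampled I-V
    trace swept from V = 0 (I ~ Isc) to V = Voc (I ~ 0)."""
    V, I = np.asarray(V, float), np.asarray(I, float)
    Voc, Isc = V[-1], I[0]
    k = int(np.argmax(V * I))                    # maximum-power point
    Vmpp, Impp = V[k], I[k]
    FF = (Vmpp * Impp) / (Voc * Isc)
    # Slope-based resistance estimates: Rs near V ~ Voc, Rsh near I ~ Isc.
    Rs = -(V[-1] - V[-2]) / (I[-1] - I[-2])
    Rsh = -(V[1] - V[0]) / (I[1] - I[0])
    Rs_star, Rsh_star = Rs * Isc / Voc, Rsh * Isc / Voc
    # Curvature sum: total absolute turning angle along the polyline.
    ang = np.arctan2(np.diff(I), np.diff(V))
    kappa = float(np.sum(np.abs(np.diff(ang))))
    # Normalized area under the curve (trapezoid rule).
    A_star = float(np.sum((I[:-1] + I[1:]) * np.diff(V)) / 2) / (Isc * Voc)
    return [FF, Vmpp / Voc, Impp / Isc, Rs_star, Rsh_star, kappa, A_star]

# Example: a smooth synthetic cubic curve with Voc = 0.6 V, Isc = 1.0 A.
V = np.linspace(0.0, 0.6, 64)
I = 1.0 - (V / 0.6) ** 3
f = voiceprint(V, I)
```

The synthetic curve is only a smoke test; real DSSC traces would be resampled to the same 64-point grid before calling the function.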

To make the sound of every solar mini-disc reproducible and solid enough for smart-contract verification, the ml.* library in Max/MSP is one solution. ml.* is a toolbox of machine learning algorithms implemented in Max to enable real-time interactive music and video with unsupervised machine learning, aimed at computer musicians and artists. The raw seven features are first sent to the ml.scale object for normalization to the range 0 to 1. The values are then passed to ml.principle, which performs Principal Component Analysis (PCA), a mathematical method that rotates and compresses data into fewer dimensions while preserving as much variance as possible. ml.principle is the Max/MSP object that implements PCA: it learns the principal axes from training data and then projects new data into that reduced space. I am not deeply familiar with the underlying mathematics, but I received a reasonable explanation from GPT, shown below in the photo gallery.
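Conceptually, what ml.principle does can be reproduced with a short SVD-based PCA in NumPy. This is a sketch of the textbook algorithm; the ml.* implementation details may differ.

```python
import numpy as np

def pca_fit_project(X, n_components=2):
    """PCA via SVD: center the data, find the principal axes, and
    project onto the top components."""
    mu = X.mean(axis=0)
    Xc = X - mu                              # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    axes = Vt[:n_components]                 # principal directions
    return Xc @ axes.T, axes, mu

# Toy stand-in for normalized 7-D voiceprints (random, illustrative only).
rng = np.random.default_rng(0)
X = rng.random((20, 7))
Z, axes, mu = pca_fit_project(X, n_components=3)
```

By construction, the first projected coordinate captures the most variance, the second the next most, and so on; new measurements are projected with the stored `axes` and `mu`, mirroring how ml.principle first trains and then projects.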

RAVE as a Mediator for a Trustable Oracle Space and a Generative Music Engine

PCA is merely linear dimensionality reduction; it cannot guarantee reproducibility under varying illumination or error correction for a binary key. This design therefore uses a fuzzy extractor: it converts the continuous latent vector z into an error-correctable, verifiable bitstring and outputs a stable key K, meeting both the identity-reproducibility and on-chain requirements.

We trained a custom I–V encoder in Google Colab (since the RAVE encoder cannot directly handle curve data, we trained a Conv1D architecture). The input is a continuous seven-dimensional "voiceprint" (FF, Vmpp/Voc, Impp/Isc, Rs*, Rsh*, curvature_sum, area); the output is a latent vector z whose dimensionality matches the downstream audio decoder. Training uses triplet loss (pulling samples from the same cell together and pushing different cells apart) plus prior matching, so that z follows the decoder's Gaussian prior. Measurement noise and illumination drift are addressed via data augmentation (gain scaling, small noise, temporal jitter).

Because z is continuous and slightly noisy, we apply a further fuzzy extraction: z is first normalized and quantized, then passed through ECC with helper data to derive a stable key K, and finally we compute the on-chain commitment keccak256(K || salt). At first registration, we generate a pseudonymous index panel_id = keccak256("panel-id" || K) for cataloging; thereafter, each report need only reconstruct K′ and compute keccak256(K′ || salt), and the contract automatically attributes the record to that panel_id, with no need to upload a human-readable ID.

The audio decoder is trained independently on musical data (or we simply adopt an existing RAVE decoder); its sole job is to sonify z. Because the encoder enforces stability and geometric relations in z, repeated measurements of the same cell yield reproducible timbre and dynamics.
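The quantize → key → commitment chain can be sketched as follows. This is a deliberately simplified model: coarse quantization stands in for the ECC-with-helper-data step, and hashlib.sha3_256 stands in for keccak256 (NIST SHA-3 and keccak256 produce different digests).

```python
import hashlib
import numpy as np

def quantize(z, step=0.25):
    """Coarse-quantize a normalized latent vector into integer bins,
    so small measurement noise maps to the same bins."""
    return np.round(np.asarray(z) / step).astype(int)

def enroll(z, salt=b"demo-salt"):
    """Enrollment: derive a stable key K from quantized z, then the
    on-chain commitment and pseudonymous panel_id."""
    K = hashlib.sha3_256(quantize(z).tobytes()).digest()
    commit = hashlib.sha3_256(K + salt).hexdigest()
    panel_id = hashlib.sha3_256(b"panel-id" + K).hexdigest()
    return K, commit, panel_id

def verify(z_new, commit, salt=b"demo-salt"):
    """Re-derive K' from a fresh measurement and check the commitment."""
    K2 = hashlib.sha3_256(quantize(z_new).tobytes()).digest()
    return hashlib.sha3_256(K2 + salt).hexdigest() == commit

z_first = np.array([0.10, 0.50, 0.30])           # enrollment measurement
K, commit, panel_id = enroll(z_first)
ok = verify(np.array([0.12, 0.52, 0.33]), commit)   # same cell, small noise
bad = verify(np.array([0.80, 0.05, 0.90]), commit)  # a different cell
```

The latent values and quantization step are hypothetical; a real deployment would use ECC helper data rather than bare quantization, so that noise crossing a bin boundary can still be corrected.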
This is a cross-modal mapping (I–V → audio latent): semantic order is maintained only on the encoder side, while the decoder-side mapping is entirely determined by artistic and compositional context (for example, mapping to AM/FM, filtering, distortion, or spatial parameters), with the goal being musical quality rather than identity verification.

Overall, RAVE (more precisely, the combination of the I–V encoder and the RAVE decoder) serves as an intermediate layer. On one hand, it establishes a verifiable semantic space: physical I–V measurements are embedded via contrastive learning into clustered yet separated geometric structures, and the fuzzy extractor yields the key K and the on-chain commitment. According to Jha et al. (2025), semantic stability relies on three core constraints: reconstruction (the transformed representation can be mapped back to its source), cycle-consistency (round-trip transforms preserve meaning), and vector-space preservation, VSP (pairwise distances among embeddings are preserved after mapping). On the other hand, it supports aesthetic generation: stable z values produce reproducible sonic styles and narratives. Philosophically, this does not equate the two media; rather, it anchors generation in an oracle, allowing aesthetics to unfold on a foundation of physical verification. RAVE thus becomes a "translational membrane": the inner layer preserves the geometry and commitments of the real (identity and causality), while the outer layer releases perceivable semantics and sound (aesthetics and expression). On-chain handles verification; off-chain handles computation and generation, a clean division of labor that both secures DSSC identity and preserves artistic freedom.
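The vector-space preservation (VSP) constraint from Jha et al. (2025) can be checked empirically by correlating pairwise distances before and after a mapping. A minimal sketch on toy data (random latents standing in for real embeddings):

```python
import numpy as np

def vsp_score(Z_src, Z_mapped):
    """Correlation between pairwise distances before and after a
    mapping; a score near 1.0 means the geometry is preserved."""
    def pdist(Z):
        diff = Z[:, None, :] - Z[None, :, :]
        d = np.sqrt((diff ** 2).sum(-1))
        iu = np.triu_indices(len(Z), k=1)    # unique pairs only
        return d[iu]
    return float(np.corrcoef(pdist(Z_src), pdist(Z_mapped))[0, 1])

rng = np.random.default_rng(1)
Z = rng.random((10, 7))                      # toy source embeddings
# A rotation plus uniform scaling preserves relative distances exactly.
Q, _ = np.linalg.qr(rng.random((7, 7)))
score_good = vsp_score(Z, 2.0 * Z @ Q)
# An unrelated random remapping destroys the geometry.
score_bad = vsp_score(Z, rng.random((10, 7)))
```

A check like this could be logged on a held-out set to detect when sonic neighborhoods stop matching energy-curve neighborhoods.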

Smart Contract Implementation

The Solar Oracle Walkman project includes a blockchain-based smart contract that validates and permanently stores I–V voiceprint data from handmade DSSCs on the Ethereum network. Deployed on the Sepolia testnet at address 0xeF19a90e5786dd0e89264F38f52CF81102db938e, the contract functions as a decentralized digital notary that verifies the authenticity of I–V characteristic measurements through security validation rules, EIP-712 signatures, and comprehensive data integrity checks. This immutable system ensures that each DSSC's unique electrical fingerprint can be cryptographically verified and stored permanently, creating a tamper-proof record of the device's performance characteristics.
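The contract's bookkeeping can be modeled off-chain in a few lines. This is a toy sketch only: the deployed Solidity contract additionally enforces EIP-712 signatures and validation rules, which are omitted here, and sha3_256 again stands in for keccak256.

```python
import hashlib

class SolarRegistry:
    """Toy off-chain model of the contract's record-keeping:
    commitments are filed under a pseudonymous panel_id."""
    def __init__(self):
        self.records = {}                 # panel_id -> list of commitments

    def enroll(self, K: bytes) -> str:
        """First registration: derive panel_id from the stable key K."""
        panel_id = hashlib.sha3_256(b"panel-id" + K).hexdigest()
        self.records.setdefault(panel_id, [])
        return panel_id

    def submit(self, K: bytes, salt: bytes) -> bool:
        """File a new commitment; rejects keys from unknown panels."""
        panel_id = hashlib.sha3_256(b"panel-id" + K).hexdigest()
        if panel_id not in self.records:
            return False
        commit = hashlib.sha3_256(K + salt).hexdigest()
        self.records[panel_id].append(commit)
        return True

reg = SolarRegistry()
pid = reg.enroll(b"stable-key-K")                 # hypothetical key bytes
ok = reg.submit(b"stable-key-K", b"salt-1")       # attributed to pid
unknown = reg.submit(b"other-key", b"salt-1")     # rejected: not enrolled
```

The point of the model is the attribution logic: every later submission is routed to its panel_id purely by reconstructing K, with no human-readable ID uploaded.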

Oracle as Bridge in the Solar Oracle Walkman: a Minimal Cognitive Generative System

Perception and machine learning can both be read as generative: they predict first, then correct. In Solar Oracle Walkman, the patterned DSSC’s I–V curve is measured and mapped into a latent state for sound, while a separate fuzzy extractor derives a verifiable key for on-chain proof. Here the oracle is the bridge: it carries a cryptographic claim (e.g., keccak256(K || salt)) from the physical world to the smart contract, keeping sonic imagination tethered to measurable traces. Following Stinson’s generic mechanism, we make the in-skull/out-skull loop explicit. Out-skull: light, temperature, material; in-skull: models and priors (RAVE encoder–decoder, human listener), and—conceptually—on-chain verification as institutional memory. The Walkman is staged as a minimal cognitive generative system: a tiny theatre where measurement becomes prediction, error becomes style, and verification becomes composition. Notes. RAVE is not the fuzzy extractor. The encoder yields a compact latent z for music; a standard fuzzy extractor (operating on z or the 7-D features) outputs (K, W) for identity stability and on-chain verification.


                  [ GENERIC MECHANISM KIND ]
    (Grounded inference: sense → abstract → validate → act)
                              │
          ──────────────────────────────────────────
         │                                         │    
[ COMPUTATIONAL MODEL ]                    [ TARGET SYSTEM ]
Solar Oracle Walkman                       human cognition
DSSC I–V measurement                       environment / sensory input
encoder + latent z_iv                      predictive priors + latent z_brain
fuzzy extractor -> K                       social/institutional memory
oracle verify (smart contract)             (report/peer/record)
         │                                         │ 
          ──────────────────────────────────────────
                              │
                       decoder│actions
                              │
          [ controlled hallucinations (reality, music) ]

Why oracle matters here. The oracle lets the piece claim: “this sound arose from this cell, now, under these conditions,” without freezing creativity; it anchors a generative act to evidence. The result is a system that is at once playful and accountable. Conservative assessment. This is a performative engineering sketch of cognition, not a testable theory of consciousness; it sits at a respectful distance from formal consciousness science.

Discussion

  1. Where things are now. The Oracle Walkman works as a simple art sculpture that sonifies DSSC I–V curves in real time. The 7-feature voiceprint is stable across illumination changes after normalization. The mapping is deliberately minimal, which makes evaluation of reproducibility straightforward. A controlled pipeline from sensing to sound is established in Max/MSP. Perception and AI are treated as two sides of the same generative mechanism. The working definition of hallucination is generation that drifts beyond admissible evidence and priors. Brains predict and correct; hallucination is an extreme case of prediction mismatch. The oracle provides external anchors to keep generation within verifiable bounds while leaving room for creative variance.
  2. What the theory is doing now. Stinson's generic-mechanism view motivates treating DSSC–RAVE and human perception as different instantiations of a common generative architecture. Feigl's correspondence model motivates explicit bridges from observation to latent variables, so every design step is tied back to measurable traces. These theoretical lenses are not goals in themselves; they function as design guidelines for dataset building, priors for mapping, and evaluation metrics for drift and variance. A current limitation is the absence of vector-space preservation (VSP). Without VSP, the latent space serves as a stable registry of identities but cannot guarantee relational meaning across cells, so the oracle functions mainly as a gatekeeper that validates authenticity while offering little semantic interpretation. With VSP, however, the oracle could evolve into a "true oracle machine": not only verifying truth but also revealing how different energy curves relate, translating physical differences into interpretable structures of another domain.
  3. Next steps. Build a small but clean training set of DSSC voiceprints with controlled illumination and temperature, then test monotonicity and local-smoothness priors. Prototype vec2vec-style constraints: simple cycle checks and distance preservation on a held-out set; log when sonic neighborhoods fail to match energy-curve neighborhoods. Investigate lightweight inference targets and compression for future mobile use. Explore whether traceable energy records can be registered as verifiable hashes derived from the sonic voiceprint, then evaluate failure modes and anti-counterfeiting limits. In this context, "oracle" refers not only to the blockchain bridge for off-chain data but also resonates with its ancient meaning: an oracular revelation. When DSSC voiceprints serve only for verification, the oracle acts as a gatekeeper; but once endowed with semantic structure, capable of revealing relations among energy curves and translating them into sound space, it transcends verification and functions as a "machine of divination," converting physical traces into messages from another world.

References

  1. Buckner, Cameron J. 2023. From Deep Learning to Rational Machines: What the History of Philosophy Can Teach Us about the Future of Artificial Intelligence. 1st ed. New York: Oxford University Press. https://doi.org/10.1093/oso/9780197653302.001.0001.
  2. Stinson, Catherine. 2020. “From Implausible Artificial Neurons to Idealized Cognitive Models: Rebooting Philosophy of Artificial Intelligence.” Philosophy of Science 87 (4): 590–611. https://doi.org/10.1086/709730.
  3. Jha, Rishi, Collin Zhang, Vitaly Shmatikov, and John X. Morris. 2025. “Harnessing the Universal Geometry of Embeddings.” arXiv:2505.12540. Preprint, arXiv, June 25. https://doi.org/10.48550/arXiv.2505.12540.
  4. https://www.hackteria.org/wiki/A_RAVE_and_starvation_synth_based_generative_sonic_device_powered_by_dye_sensitized_solar_cell
  5. https://github.com/shihweichieh2023/IVcurve_tester
  6. https://github.com/rjha18/vec2vec
  7. https://github.com/shihweichieh2023/solar-oracle-walkman